---
language: et
license: cc-by-sa-4.0
inference: false
base_model:
- EMBEDDIA/est-roberta
pipeline_tag: token-classification
tags:
- NER
---

# est-roberta-hist-ner

## Model description 

est-roberta-hist-ner is an [Est-RoBERTa](https://huggingface.co/EMBEDDIA/est-roberta)-based model fine-tuned for named entity recognition in 19th-century Estonian parish court records (for details, see [this repository](https://github.com/soras/vk_ner_lrec_2022)). 
The model recognizes the following entity types: person names (PER), ambiguous locations-organizations (LOC_ORG), locations (LOC), organizations (ORG) and miscellaneous names (MISC).

## How to use 

The recommended way to use the model is with the appropriate pre- and postprocessing by EstNLTK. 
For a usage example, see this tutorial: [https://github.com/soras/vk\_ner\_lrec\_2022/blob/main/using\_bert\_ner\_tagger.ipynb](https://github.com/soras/vk_ner_lrec_2022/blob/main/using_bert_ner_tagger.ipynb) 
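
For a quick look at the raw predictions without EstNLTK, the model can presumably also be loaded through the generic `transformers` token-classification pipeline. The following is a minimal sketch, not the recommended workflow: the repository id in `MODEL_ID` is an assumption (replace it with the actual id of this model), the helper names `load_tagger` and `entities_by_type` are hypothetical, and the sketch assumes the model's label scheme is compatible with the pipeline's `aggregation_strategy="simple"` option. Because this path skips the EstNLTK pre- and postprocessing, results may differ from those obtained with the tutorial.

```python
from typing import Dict, List

# Assumption: the Hugging Face repository id of this model. Replace as needed.
MODEL_ID = "tartuNLP/est-roberta-hist-ner"


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def entities_by_type(predictions: List[Dict]) -> Dict[str, List[str]]:
    """Group aggregated pipeline predictions by entity type (PER, LOC, ...)."""
    grouped: Dict[str, List[str]] = {}
    for pred in predictions:
        grouped.setdefault(pred["entity_group"], []).append(pred["word"])
    return grouped
```

With `transformers` and a backend such as PyTorch installed, `entities_by_type(load_tagger()(text))` would return a dictionary that maps each entity type to the list of matched strings in `text`.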

## Citation

If you use this model in your work, please cite us as follows:
	
	@InProceedings{orasmaa-EtAl:2022:LREC,
	  author    = {Orasmaa, Siim  and  Muischnek, Kadri  and  Poska, Kristjan  and  Edela, Anna},
	  title     = {Named Entity Recognition in Estonian 19th Century Parish Court Records},
	  booktitle      = {Proceedings of the Language Resources and Evaluation Conference},
	  month          = {June},
	  year           = {2022},
	  address        = {Marseille, France},
	  publisher      = {European Language Resources Association},
	  pages     = {5304--5313},
	  url       = {https://aclanthology.org/2022.lrec-1.568}
	}