emanuelaboros committed on
Commit 6af1324 · verified · 1 Parent(s): 97efbd9

Update README.md

Files changed (1)
  1. README.md +9 -16
README.md CHANGED
@@ -10,13 +10,6 @@ tags:
 
 The **Impresso NER model** is based on the stacked Transformer architecture published in [CoNLL 2020](https://aclanthology.org/2020.conll-1.35/) trained on the Impresso HIPE-2020 portion of the [HIPE-2022 dataset](https://github.com/hipe-eval/HIPE-2022-data). It recognizes entity types such as person, location, and organization while supporting the complete [HIPE typology](https://github.com/hipe-eval/HIPE-2022-data/blob/main/documentation/README-hipe2020.md), including coarse and fine-grained entity types as well as components like names, titles, and roles. Additionally, the NER model's backbone ([dbmdz/bert-medium-historic-multilingual-cased](https://huggingface.co/dbmdz/bert-medium-historic-multilingual-cased)) was trained on various European historical datasets, giving it a broader language capability. This training included data from the Europeana and British Library collections across multiple languages: German, French, English, Finnish, and Swedish. Due to this multilingual backbone, the NER model may also recognize entities in other languages beyond French and German.
 
-
-## Model Details
-
-
-<!-- Provide a quick summary of what the model is/does. -->
-dbmdz/bert-medium-historic-multilingual-cased
-
 #### How to use
 
 You can use this model with Transformers *pipeline* for NER.
@@ -47,15 +40,15 @@ print(entities)
 
 ```
 [
-{'type': 'time.date.abs', 'confidence_ner': 85.0, 'index': (0, 5), 'surface': "En l'an 1348", 'lOffset': 0, 'rOffset': 12},
-{'type': 'loc.adm.nat', 'confidence_ner': 90.75, 'index': (19, 20), 'surface': 'Europe', 'lOffset': 69, 'rOffset': 75},
-{'type': 'loc', 'confidence_ner': 75.45, 'index': (22, 25), 'surface': 'Royaume de France', 'lOffset': 80, 'rOffset': 97},
-{'type': 'pers.ind', 'confidence_ner': 85.27, 'index': (44, 47), 'surface': 'roi Philippe VI', 'lOffset': 181, 'rOffset': 196, 'title': 'roi', 'name': 'roi Philippe VI'},
-{'type': 'loc.adm.town', 'confidence_ner': 30.59, 'index': (51, 52), 'surface': 'Louvre', 'lOffset': 210, 'rOffset': 216},
-{'type': 'loc.adm.town', 'confidence_ner': 94.46, 'index': (60, 61), 'surface': 'Paris', 'lOffset': 266, 'rOffset': 271},
-{'type': 'pers.ind', 'confidence_ner': 96.1, 'index': (77, 81), 'surface': 'chancelier Guillaume de Nogaret', 'lOffset': 350, 'rOffset': 381, 'title': 'chancelier', 'name': 'chancelier Guillaume de Nogaret'},
-{'type': 'loc.adm.nat', 'confidence_ner': 49.35, 'index': (22, 23), 'surface': 'Royaume', 'lOffset': 80, 'rOffset': 87},
-{'type': 'loc.adm.nat', 'confidence_ner': 24.18, 'index': (24, 25), 'surface': 'France', 'lOffset': 91, 'rOffset': 97}
+{'type': 'time.date.abs', 'confidence_ner': 85.0, 'surface': "En l'an 1348", 'lOffset': 0, 'rOffset': 12},
+{'type': 'loc.adm.nat', 'confidence_ner': 90.75, 'surface': 'Europe', 'lOffset': 69, 'rOffset': 75},
+{'type': 'loc', 'confidence_ner': 75.45, 'surface': 'Royaume de France', 'lOffset': 80, 'rOffset': 97},
+{'type': 'pers.ind', 'confidence_ner': 85.27, 'surface': 'roi Philippe VI', 'lOffset': 181, 'rOffset': 196, 'title': 'roi', 'name': 'roi Philippe VI'},
+{'type': 'loc.adm.town', 'confidence_ner': 30.59, 'surface': 'Louvre', 'lOffset': 210, 'rOffset': 216},
+{'type': 'loc.adm.town', 'confidence_ner': 94.46, 'surface': 'Paris', 'lOffset': 266, 'rOffset': 271},
+{'type': 'pers.ind', 'confidence_ner': 96.1, 'surface': 'chancelier Guillaume de Nogaret', 'lOffset': 350, 'rOffset': 381, 'title': 'chancelier', 'name': 'chancelier Guillaume de Nogaret'},
+{'type': 'loc.adm.nat', 'confidence_ner': 49.35, 'surface': 'Royaume', 'lOffset': 80, 'rOffset': 87},
+{'type': 'loc.adm.nat', 'confidence_ner': 24.18, 'surface': 'France', 'lOffset': 91, 'rOffset': 97}
 ]
 ```
 
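
The `print(entities)` in the second hunk header belongs to the README's usage snippet, which lies outside this diff. Below is a minimal sketch of what such a Transformers *pipeline* call can look like; the repository id and the input sentence are illustrative placeholders, and the exact task name or loading options the Impresso model requires are not shown here. A plain `"ner"` (token-classification) pipeline would return the library's standard keys rather than the custom ones shown above.

```python
# Minimal sketch of a Transformers pipeline call, as the README describes.
# The repository id and the sentence are illustrative placeholders; the exact
# task name / loading options the Impresso model needs are not part of this diff.
from transformers import pipeline

MODEL_ID = "impresso-project/ner-model"  # hypothetical repo id, replace with the real one

ner = pipeline("ner", model=MODEL_ID)  # "ner" is the generic token-classification task

sentence = "En l'an 1348, le roi Philippe VI séjournait au Louvre, à Paris."
entities = ner(sentence)
print(entities)
```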
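The change itself only drops the token-level `'index'` field from the example output, so downstream code should rely on the character offsets instead. A standalone sketch of consuming the post-change format, using two entries copied from the output above:

```python
# Standalone sketch of consuming the post-change output: entities carry character
# offsets ('lOffset'/'rOffset') but no token-level 'index' anymore.
entities = [  # two entries copied from the example output above
    {'type': 'loc.adm.town', 'confidence_ner': 94.46, 'surface': 'Paris', 'lOffset': 266, 'rOffset': 271},
    {'type': 'loc.adm.town', 'confidence_ner': 30.59, 'surface': 'Louvre', 'lOffset': 210, 'rOffset': 216},
]

CONFIDENCE_THRESHOLD = 50.0  # arbitrary cut-off chosen for illustration

for entity in entities:
    if entity['confidence_ner'] < CONFIDENCE_THRESHOLD:
        continue  # e.g. the 30.59 'Louvre' prediction would be skipped
    # [lOffset, rOffset) is the character span of 'surface' in the input text.
    span = (entity['lOffset'], entity['rOffset'])
    print(entity['type'], entity['surface'], span)
```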