Update README.md
README.md
CHANGED
@@ -1,3 +1,56 @@
|
|
1 |
-
---
|
2 |
-
license: cc-by-4.0
|
3 |
-
1 |
+
---
|
2 |
+
license: cc-by-nc-4.0
|
3 |
+
language:
|
4 |
+
- fr
|
5 |
+
pipeline_tag: token-classification
|
6 |
+
widget:
|
7 |
+
- text: >-
|
8 |
+
* ALBI, (Géog.) ville de France, capitale de l'Albigeois, dans le haut
|
9 |
+
Languedoc : elle est sur le Tarn. Long. 19. 49. lat. 43. 55. 44.
|
10 |
+
- text: >-
|
11 |
+
HILPERHAUSEN, (Géog.) ville d'Allemagne en Franconie, sur la Werra, au comté de Henneberg, entre Cobourg & Smalcalde ; elle appartient à une branche de la maison de Saxe-Gotha. Long. 28. 15. lat. 50. 35. (D. J.)
|
12 |
+
---
|
13 |
+
|
14 |
+
# bert-base-french-cased-edda-ner-levels
|
15 |
+
|
16 |
+
|
17 |
+
<!-- Provide a quick summary of what the model is/does. -->
|
18 |
+
|
19 |
+
This model performs named entity recognition (NER) as token classification: it identifies entity spans and assigns each a class, using the IOB2 tagging scheme.
|
20 |
+
It was trained on the French *Encyclopédie ou dictionnaire raisonné des sciences, des arts et des métiers, par une société de gens de lettres* (1751-1772), edited by Diderot and d'Alembert (text provided by the [ARTFL Encyclopédie Project](https://artfl-project.uchicago.edu)).
|
21 |
+
Dataset: [https://huggingface.co/datasets/GEODE/GeoEDdA](https://huggingface.co/datasets/GEODE/GeoEDdA)
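A minimal usage sketch with the `transformers` token-classification pipeline is given here for illustration. The Hub identifier `GEODE/bert-base-french-cased-edda-ner-levels` is an assumption inferred from the model name and the dataset namespace, and `tag_text` and `group_by_label` are hypothetical helper names; adjust them to the actual repository.

```python
from collections import defaultdict

# Assumption: Hub id inferred from the model name and the GEODE dataset namespace.
MODEL_ID = "GEODE/bert-base-french-cased-edda-ner-levels"


def tag_text(text, model_id=MODEL_ID):
    """Run the tagger on one text (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: only needed for inference

    # aggregation_strategy="simple" merges word pieces and B-/I- tags into spans
    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )
    return tagger(text)


def group_by_label(predictions):
    """Group aggregated pipeline predictions by their entity label."""
    grouped = defaultdict(list)
    for pred in predictions:
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)
```

Calling `group_by_label(tag_text(...))` on one of the widget texts above should return a dictionary that maps each detected label to the matching text spans.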
|
22 |
+
|
23 |
+
|
24 |
+
## Class labels
|
25 |
+
|
26 |
+
<!-- Provide a list of tags detected by the model. -->
|
27 |
+
|
28 |
+
The model detects the following entity classes:
|
29 |
+
- **NC-Spatial**: a common noun that identifies a spatial entity (nominal spatial entity) including natural features, e.g. `ville`, `la rivière`, `royaume`.
|
30 |
+
- **NP-Spatial**: a proper noun identifying the name of a place (spatial named entities), e.g. `France`, `Paris`, `la Chine`.
|
31 |
+
- **ENE-Spatial**: nested spatial entity, e.g. `ville de France`, `royaume de Naples`, `la mer Baltique`.
|
32 |
+
- **Relation**: spatial relation, e.g. `dans`, `sur`, `à 10 lieues de`.
|
33 |
+
- **Latlong**: geographic coordinates, e.g. `Long. 19. 49. lat. 43. 55. 44.`
|
34 |
+
- **NC-Person**: a common noun that identifies a person (nominal person entity), e.g. `roi`, `l'empereur`, `les auteurs`.
|
35 |
+
- **NP-Person**: a proper noun identifying the name of a person (person named entities), e.g. `Louis XIV`, `Pline`, `les Romains`.
|
36 |
+
- **ENE-Person**: nested person entity, e.g. `le czar Pierre`, `roi de Macédoine`.
|
37 |
+
- **NP-Misc**: a proper noun identifying entities not classified as spatial or person, e.g. `l'Eglise`, `1702`, `Pélasgique`.
|
38 |
+
- **ENE-Misc**: nested named entity not classified as spatial or person, e.g. `l'ordre de S. Jacques`, `la déclaration du 21 Mars 1671`.
|
39 |
+
- **Head**: the entry name (headword), e.g. `ALBI`, `HILPERHAUSEN`.
|
40 |
+
- **Domain-Mark**: words indicating the knowledge domain (usually after the head and in parentheses), e.g. `Géographie`, `Geog.`, `en Anatomie`.
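To illustrate how IOB2 prefixes combine with the class labels above, here is a small sketch that merges token-level tags into labelled spans. It assumes tags of the form `B-<label>` / `I-<label>` / `O`; the function name `iob2_to_spans` and the example tags are hand-written for illustration and are not model output.

```python
def iob2_to_spans(tokens, tags):
    """Merge IOB2-tagged tokens into (label, text) spans."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        # "B-NP-Spatial" -> ("B", "NP-Spatial"); "O" -> ("O", "")
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the span that is currently open
            current_tokens.append(token)
            continue
        if current_label is not None:
            # Any other tag closes the open span
            spans.append((current_label, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag is treated leniently as the start of a new span
            current_label, current_tokens = label, [token]
        else:
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hand-written illustrative tags (assumption, not model predictions)
tokens = ["elle", "est", "sur", "le", "Tarn"]
tags = ["O", "O", "B-Relation", "B-NP-Spatial", "I-NP-Spatial"]
spans = iob2_to_spans(tokens, tags)
```

Because class names such as `NP-Spatial` contain a hyphen themselves, the sketch splits each tag only at the first hyphen so that the label is kept whole.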
|
41 |
+
|
42 |
+
|
43 |
+
|
44 |
+
## Bias, Risks, and Limitations
|
45 |
+
|
46 |
+
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
|
47 |
+
|
48 |
+
This model was trained entirely on French encyclopedic entries and will likely not perform well on text in other languages or other corpora.
|
49 |
+
|
50 |
+
|
51 |
+
## Acknowledgement
|
52 |
+
|
53 |
+
|
54 |
+
The authors are grateful to the [ASLAN project](https://aslan.universite-lyon.fr) (ANR-10-LABX-0081) of the Université de Lyon, for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR).
|
55 |
+
Data courtesy of the [ARTFL Encyclopédie Project](https://artfl-project.uchicago.edu), University of Chicago.
|
56 |
+
|