Update README.md
README.md
CHANGED
@@ -4,9 +4,10 @@ tags:
|
|
4 |
- token-classification
|
5 |
language:
|
6 |
- es
|
|
|
7 |
license: mit
|
8 |
model-index:
|
9 |
-
- name:
|
10 |
results:
|
11 |
- task:
|
12 |
name: NER
|
@@ -21,54 +22,26 @@ model-index:
|
|
21 |
- name: NER F Score
|
22 |
type: f_score
|
23 |
value: 0.6911764706
|
24 |
-
|
25 |
-
|
26 |
-
type: token-classification
|
27 |
-
metrics:
|
28 |
-
- name: POS (UPOS) Accuracy
|
29 |
-
type: accuracy
|
30 |
-
value: 0.0
|
31 |
-
- task:
|
32 |
-
name: MORPH
|
33 |
-
type: token-classification
|
34 |
-
metrics:
|
35 |
-
- name: Morph (UFeats) Accuracy
|
36 |
-
type: accuracy
|
37 |
-
value: 0.0
|
38 |
-
- task:
|
39 |
-
name: LEMMA
|
40 |
-
type: token-classification
|
41 |
-
metrics:
|
42 |
-
- name: Lemma Accuracy
|
43 |
-
type: accuracy
|
44 |
-
value: 0.0
|
45 |
-
- task:
|
46 |
-
name: UNLABELED_DEPENDENCIES
|
47 |
-
type: token-classification
|
48 |
-
metrics:
|
49 |
-
- name: Unlabeled Attachment Score (UAS)
|
50 |
-
type: f_score
|
51 |
-
value: 0.0
|
52 |
-
- task:
|
53 |
-
name: LABELED_DEPENDENCIES
|
54 |
-
type: token-classification
|
55 |
-
metrics:
|
56 |
-
- name: Labeled Attachment Score (LAS)
|
57 |
-
type: f_score
|
58 |
-
value: 0.0
|
59 |
-
- task:
|
60 |
-
name: SENTS
|
61 |
-
type: token-classification
|
62 |
-
metrics:
|
63 |
-
- name: Sentences F-Score
|
64 |
-
type: f_score
|
65 |
-
value: 0.0
|
66 |
---
|
67 |
-
|
68 |
|
69 |
| Feature | Description |
|
70 |
| --- | --- |
|
71 |
-
| **Name** | `
|
72 |
| **Version** | `1.0.0` |
|
73 |
| **spaCy** | `>=3.2.3,<4.0.0` |
|
74 |
| **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
|
@@ -96,18 +69,7 @@ This is a Spacy multilingual anonimization model, for use with BSC's Anonymizati
|
|
96 |
|
97 |
| Type | Score |
|
98 |
| --- | --- |
|
99 |
-
| `POS_ACC` | 0.00 |
|
100 |
-
| `MORPH_ACC` | 0.00 |
|
101 |
-
| `MORPH_PER_FEAT` | 0.00 |
|
102 |
-
| `DEP_UAS` | 0.00 |
|
103 |
-
| `DEP_LAS` | 0.00 |
|
104 |
-
| `DEP_LAS_PER_TYPE` | 0.00 |
|
105 |
-
| `SENTS_P` | 0.00 |
|
106 |
-
| `SENTS_R` | 0.00 |
|
107 |
-
| `SENTS_F` | 0.00 |
|
108 |
-
| `LEMMA_ACC` | 0.00 |
|
109 |
| `ENTS_F` | 69.12 |
|
110 |
| `ENTS_P` | 74.60 |
|
111 |
| `ENTS_R` | 64.38 |
|
112 |
-
| `
|
113 |
-
| `NER_LOSS` | 26573.78 |
|
|
|
4 |
- token-classification
|
5 |
language:
|
6 |
- es
|
7 |
+
- ca
|
8 |
license: mit
|
9 |
model-index:
|
10 |
+
- name: ca_anonimization_core_lg
|
11 |
results:
|
12 |
- task:
|
13 |
name: NER
|
|
|
22 |
- name: NER F Score
|
23 |
type: f_score
|
24 |
value: 0.6911764706
|
25 |
+
widget:
|
26 |
+
- text: "La matrícula del coche es 8560 JXK y el nombre del propietario es Jon Permanyer Ugartemendia, DNI 362-69-58-6n. Tel: 628539864. Calle Pasteur 46 Bajos, 08024 Barcelona"
|
27 |
---
|
28 |
+
|
29 |
+
This is a spaCy multilingual (Catalan & Spanish) anonymization model, for use with BSC's AnonymizationPipeline at:
|
30 |
+
|
31 |
+
https://github.com/TeMU-BSC/AnonymizationPipeline.
|
32 |
+
|
33 |
+
pip install https://huggingface.co/PlanTL-GOB-ES/es_anonimization_core_lg/resolve/main/es_anonimization_core_lg-any-py3-none-any.whl
|
34 |
+
|
35 |
+
The anonymization pipeline is a library for identifying sensitive data, and ultimately anonymizing it, in Spanish and Catalan user-generated plain text.
|
36 |
+
|
37 |
+
This is not a standalone model and is meant to work within the pipeline.
|
38 |
+
|
39 |
+
The model can detect the following entities: `EMAIL`, `FINANCIAL`, `ID`, `LOC`, `MISC`, `ORG`, `PER`, `TELEPHONE`, `VEHICLE`, `ZIP`
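To make the "identify, then anonymize" idea concrete, here is a minimal sketch of the replacement step only. It is not the AnonymizationPipeline's actual implementation: the `redact` and `span_of` helpers are hypothetical, and the spans and labels below are written by hand for illustration, not produced by the model. The sketch assumes entities arrive as `(start_char, end_char, label)` tuples, which is the shape you would get from the `start_char`, `end_char` and `label_` attributes of spaCy entity spans.

```python
from typing import List, Tuple

# (start_char, end_char, label); hypothetical container for detected entities.
Span = Tuple[int, int, str]


def span_of(text: str, fragment: str, label: str) -> Span:
    """Build a span for the first occurrence of `fragment` (illustration helper)."""
    start = text.index(fragment)
    return (start, start + len(fragment), label)


def redact(text: str, spans: List[Span]) -> str:
    """Replace each labelled character span with a <LABEL> placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            raise ValueError("overlapping spans are not handled in this sketch")
        pieces.append(text[cursor:start])  # untouched text before the entity
        pieces.append(f"<{label}>")        # placeholder instead of the entity
        cursor = end
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces)


sample = "Tel: 628539864. 08024 Barcelona"
# Hand-assigned labels taken from the entity list above; not model output.
spans = [
    span_of(sample, "628539864", "TELEPHONE"),
    span_of(sample, "08024", "ZIP"),
    span_of(sample, "Barcelona", "LOC"),
]
redacted = redact(sample, spans)
print(redacted)  # Tel: <TELEPHONE>. <ZIP> <LOC>
```

Replacing with the label is only one possible strategy; a real pipeline may instead mask characters or substitute synthetic values.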
|
40 |
+
|
41 |
|
42 |
| Feature | Description |
|
43 |
| --- | --- |
|
44 |
+
| **Name** | `ca_anonimization_core_lg` |
|
45 |
| **Version** | `1.0.0` |
|
46 |
| **spaCy** | `>=3.2.3,<4.0.0` |
|
47 |
| **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
|
|
|
69 |
|
70 |
| Type | Score |
|
71 |
| --- | --- |
|
72 |
| `ENTS_F` | 69.12 |
|
73 |
| `ENTS_P` | 74.60 |
|
74 |
| `ENTS_R` | 64.38 |
|
75 |
+
| `NER_LOSS` | 26573.78 |
|
|
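As a quick sanity check on the accuracy table, the F-score should be the harmonic mean of precision and recall. The sketch below recomputes it from the rounded `ENTS_P` and `ENTS_R` values shown above; the `f_score` helper is written here for illustration and is not taken from spaCy. Because the inputs are already rounded to two decimals, the recomputed value can differ from the reported `ENTS_F` in the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the accuracy table (percentages, rounded to two decimals).
ents_p = 74.60
ents_r = 64.38
ents_f_reported = 69.12

ents_f = f_score(ents_p, ents_r)
gap = abs(ents_f - ents_f_reported)

print(f"recomputed ENTS_F: {ents_f:.2f}")
print(f"difference from reported value: {gap:.3f}")
```

A small gap is expected from rounding alone; a larger one would suggest that the three rows were not taken from the same evaluation run.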