PlanTL-GOB-ES/es_anonimization_core_lg

Token Classification

spaCy

Spanish

Catalan

Eval Results

crodri committed on Jan 11, 2023

Commit

cce4b98

1 Parent(s): 3e82849

Update README.md

Files changed (1)

README.md +19 -57

@@ -4,9 +4,10 @@ tags:
 - token-classification
 language:
 - es
+- ca
 license: mit
 model-index:
-- name: es_anonimization_core_lg
+- name: ca_anonimization_core_lg
   results:
   - task:
       name: NER
@@ -21,54 +22,26 @@ model-index:
     - name: NER F Score
       type: f_score
       value: 0.6911764706
-  - task:
-      name: POS
-      type: token-classification
-    metrics:
-    - name: POS (UPOS) Accuracy
-      type: accuracy
-      value: 0.0
-  - task:
-      name: MORPH
-      type: token-classification
-    metrics:
-    - name: Morph (UFeats) Accuracy
-      type: accuracy
-      value: 0.0
-  - task:
-      name: LEMMA
-      type: token-classification
-    metrics:
-    - name: Lemma Accuracy
-      type: accuracy
-      value: 0.0
-  - task:
-      name: UNLABELED_DEPENDENCIES
-      type: token-classification
-    metrics:
-    - name: Unlabeled Attachment Score (UAS)
-      type: f_score
-      value: 0.0
-  - task:
-      name: LABELED_DEPENDENCIES
-      type: token-classification
-    metrics:
-    - name: Labeled Attachment Score (LAS)
-      type: f_score
-      value: 0.0
-  - task:
-      name: SENTS
-      type: token-classification
-    metrics:
-    - name: Sentences F-Score
-      type: f_score
-      value: 0.0
+widget:
+- text: "La matrícula del coche es 8560 JXK y el nombre del propietario es Jon Permanyer Ugartemendia, DNI 362-69-58-6n. Tel:  628539864. Calle Pasteur 46 Bajos, 08024 Barcelona"
 ---
-This is a Spacy multilingual anonimization model, for use with BSC's AnonymizationPipeline at https://github.com/TeMU-BSC/AnonymizationPipeline. The anonymization pipeline is a library for performing sensitive data identification and posterior anonymization of the detected data in Spanish and Catalan user generated plain text.
+This is a Spacy multilingual (Catalan & Spanish) anonimization model, for use with BSC's AnonymizationPipeline at:
+https://github.com/TeMU-BSC/AnonymizationPipeline.
+pip install https://huggingface.co/PlanTL-GOB-ES/es_anonimization_core_lg/resolve/main/es_anonimization_core_lg-any-py3-none-any.whl
+The anonymization pipeline is a library for performing sensitive data identification and ultimately anonymization of the detected data in Spanish and Catalan user generated plain text.
+This is not a standalone model and is meant to work within the pipeline.
+The model can detect the following entities: `EMAIL`, `FINANCIAL`, `ID`, `LOC`, `MISC`, `ORG`, `PER`, `TELEPHONE`, `VEHICLE`, `ZIP`
 | Feature | Description |
 | --- | --- |
-| **Name** | `es_anonimization_core_lg` |
+| **Name** | `ca_anonimization_core_lg` |
 | **Version** | `1.0.0` |
 | **spaCy** | `>=3.2.3,<4.0.0` |
 | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
@@ -96,18 +69,7 @@ This is a Spacy multilingual anonimization model, for use with BSC's Anonymizati
 | Type | Score |
 | --- | --- |
-| `POS_ACC` | 0.00 |
-| `MORPH_ACC` | 0.00 |
-| `MORPH_PER_FEAT` | 0.00 |
-| `DEP_UAS` | 0.00 |
-| `DEP_LAS` | 0.00 |
-| `DEP_LAS_PER_TYPE` | 0.00 |
-| `SENTS_P` | 0.00 |
-| `SENTS_R` | 0.00 |
-| `SENTS_F` | 0.00 |
-| `LEMMA_ACC` | 0.00 |
 | `ENTS_F` | 69.12 |
 | `ENTS_P` | 74.60 |
 | `ENTS_R` | 64.38 |
-| `TOK2VEC_LOSS` | 0.00 |
-| `NER_LOSS` | 26573.78 |
+| `NER_LOSS` | 26573.78 |
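
The updated README says the model detects entities such as `TELEPHONE` and `ZIP` so that a surrounding pipeline can anonymize them. The sketch below illustrates only that last step, under stated assumptions: `mask_entities` and `find_span` are hypothetical helpers, not part of AnonymizationPipeline, and the spans are hand-picked from a fragment of the widget text rather than produced by the model.

```python
from typing import List, Tuple

# Entity labels listed in the updated model card.
LABELS = {"EMAIL", "FINANCIAL", "ID", "LOC", "MISC",
          "ORG", "PER", "TELEPHONE", "VEHICLE", "ZIP"}

Span = Tuple[int, int, str]  # (start_char, end_char, label)


def find_span(text: str, value: str, label: str) -> Span:
    """Hypothetical helper: locate the first occurrence of `value` in `text`."""
    start = text.index(value)
    return (start, start + len(value), label)


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each character span with a <LABEL> placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if start < cursor:
            raise ValueError("overlapping spans are not supported in this sketch")
        pieces.append(text[cursor:start])  # untouched text before the entity
        pieces.append(f"<{label}>")        # placeholder in place of the entity
        cursor = end
    pieces.append(text[cursor:])           # remaining tail
    return "".join(pieces)


# Fragment of the widget example; spans are chosen by hand, not model output.
text = "Tel: 628539864. Calle Pasteur 46 Bajos, 08024 Barcelona"
spans = [
    find_span(text, "628539864", "TELEPHONE"),
    find_span(text, "08024", "ZIP"),
]
masked = mask_entities(text, spans)
print(masked)
```

With a loaded spaCy pipeline, the same `(start, end, label)` triples could be read from `doc.ents` via `ent.start_char`, `ent.end_char`, and `ent.label_`. Whether the actual pipeline masks entities this way is not stated in the README.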