---
tags:
- spacy
- token-classification
language:
- es
license: mit
model-index:
- name: es_neg_uncert_ehr_ner
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.8964797136
    - name: NER Recall
      type: recall
      value: 0.8997005988
    - name: NER F Score
      type: f_score
      value: 0.8980872684
library_name: spacy
pipeline_tag: token-classification
---
| Feature | Description |
| --- | --- |
| **Name** | `es_neg_uncert_ehr_ner` |
| **Version** | `0.0.0` |
| **spaCy** | `>=3.7.2,<3.8.0` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | `mit` |
| **Author** | [Álvaro García Barragán](https://www.linkedin.com/in/%C3%A1lvaro-garc%C3%ADa-barrag%C3%A1n/) |
### Label Scheme
<details>

<summary>View label scheme (4 labels for 1 component)</summary>

| Component | Labels |
| --- | --- |
| **`ner`** | `NEG`, `NSCO`, `UNC`, `USCO` |

</details>
### Accuracy
| Type | Score |
| --- | --- |
| `ENTS_F` | 89.81 |
| `ENTS_P` | 89.65 |
| `ENTS_R` | 89.97 |
| `TRANSFORMER_LOSS` | 34598.52 |
| `NER_LOSS` | 35036.89 |
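
The three entity scores are related. Assuming the usual balanced F-score definition (harmonic mean of precision and recall), `ENTS_F` can be recomputed from `ENTS_P` and `ENTS_R`. The sketch below uses the unrounded values from the metadata at the top of this card; the helper name `f_score` is illustrative only.

```python
# Unrounded values copied from the model-index metadata of this card.
precision = 0.8964797136
recall = 0.8997005988


def f_score(p: float, r: float) -> float:
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    if p + r == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * p * r / (p + r)


f1 = f_score(precision, recall)
print(f"ENTS_F = {100 * f1:.2f}")  # prints "ENTS_F = 89.81"
```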
## Citation
If you use our work in your research, please cite it as follows:
```bibtex
@INPROCEEDINGS{garcia-barraganCBMS2023,
author={García-Barragán, Alvaro and Solarte-Pabón, Oswaldo and Nedostup, Georgiy and Provencio, Mariano and Menasalvas, Ernestina and Robles, Victor},
booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)},
title={Structuring Breast Cancer Spanish Electronic Health Records Using Deep Learning},
year={2023},
pages={404-409},
keywords={Natural Language Processing (NLP), Information extraction, Deep Learning, Breast cancer.},
doi={10.1109/CBMS58004.2023.00252}
}
```
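## Installing
The commands below use notebook syntax: the leading `!` runs each line as a shell command from a Jupyter or Colab cell, so drop it when installing from a regular terminal.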
```
!pip install pip==22.0.2
!pip install https://huggingface.co/Alvaro8gb/es_neg_uncert_ehr_ner/resolve/main/es_neg_uncert_ehr_ner-any-py3-none-any.whl
```
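
Once the wheel is installed, the pipeline should load like any other packaged spaCy model. The following is a minimal sketch, not an official usage guide: it assumes the installed package name matches the **Name** field above (`es_neg_uncert_ehr_ner`), and the helpers `group_entities` and `tag` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Assumption: the installed package name matches the "Name" field of this card.
MODEL_NAME = "es_neg_uncert_ehr_ner"


def group_entities(doc):
    """Group entity texts by label.

    Works on any spaCy-like Doc, i.e. an object whose `.ents` items
    expose `.text` and `.label_`.
    """
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


def tag(text):
    """Load the installed pipeline and return its entities grouped by label."""
    import spacy  # imported lazily so group_entities works without spaCy

    nlp = spacy.load(MODEL_NAME)
    return group_entities(nlp(text))
```

For example, `tag("Paciente sin fiebre ni tos.")` (a made-up sentence) would return a dictionary keyed by whichever of the labels `NEG`, `NSCO`, `UNC`, `USCO` the model predicts for that text.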
## Dataset
NUBes is a corpus of 29,682 sentences obtained from anonymised Spanish health records and annotated with negation and uncertainty.
```bibtex
@article{lima2020nubes,
title={NUBes: A corpus of negation and uncertainty in Spanish clinical texts},
author={Lima, Salvador and Perez, Naiara and Cuadros, Montse and Rigau, German},
journal={arXiv preprint arXiv:2004.01092},
year={2020}
}
```