---
tags:
- spacy
- token-classification
language:
- es
license: mit
model-index:
- name: es_neg_uncert_ehr_ner
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.8964797136
    - name: NER Recall
      type: recall
      value: 0.8997005988
    - name: NER F Score
      type: f_score
      value: 0.8980872684
library_name: spacy
pipeline_tag: token-classification
---

| Feature | Description |
| --- | --- |
| **Name** | `es_neg_uncert_ehr_ner` |
| **Version** | `0.0.0` |
| **spaCy** | `>=3.7.2,<3.8.0` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | `mit` |
| **Author** | [Álvaro García Barragán](https://www.linkedin.com/in/%C3%A1lvaro-garc%C3%ADa-barrag%C3%A1n/) |

### Label Scheme

<details>

<summary>View label scheme (4 labels for 1 component)</summary>

| Component | Labels |
| --- | --- |
| **`ner`** | `NEG`, `NSCO`, `UNC`, `USCO` |

</details>

### Accuracy

| Type | Score |
| --- | --- |
| `ENTS_F` | 89.81 |
| `ENTS_P` | 89.65 |
| `ENTS_R` | 89.97 |
| `TRANSFORMER_LOSS` | 34598.52 |
| `NER_LOSS` | 35036.89 |

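The reported `ENTS_F` can be cross-checked against `ENTS_P` and `ENTS_R`, assuming it is the usual F1 score (the harmonic mean of precision and recall). The sketch below uses only the values listed in this card's metadata; the helper name `f1_score` is illustrative and not part of the model or of spaCy.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the model-index metadata of this card.
precision = 0.8964797136
recall = 0.8997005988
reported_f = 0.8980872684

# Assumption: the reported F score is the plain F1 of the two values above.
computed_f = f1_score(precision, recall)
print(f"computed F1 = {computed_f:.4f}, reported = {reported_f:.4f}")
```

Under that assumption the computed value agrees with the reported one, and scaling by 100 gives the 89.81 shown in the table.
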
## Citation

If you use our work in your research, please cite it as follows:

```bibtex
@INPROCEEDINGS{garcia-barraganCBMS2023,
  author={García-Barragán, Alvaro and Solarte-Pabón, Oswaldo and Nedostup, Georgiy and Provencio, Mariano and Menasalvas, Ernestina and Robles, Victor},
  booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)},
  title={Structuring Breast Cancer Spanish Electronic Health Records Using Deep Learning},
  year={2023},
  pages={404-409},
  keywords={Natural Language Processing (NLP), Information extraction, Deep Learning, Breast cancer.},
  doi={10.1109/CBMS58004.2023.00252}
}
```

## Installing

The commands below use notebook syntax (Jupyter or Colab); drop the leading `!` to run them in a regular shell.

```
!pip install pip==22.0.2
!pip install https://huggingface.co/Alvaro8gb/es_neg_uncert_ehr_ner/resolve/main/es_neg_uncert_ehr_ner-any-py3-none-any.whl
```

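Once the wheel is installed, the pipeline should load like any other packaged spaCy model. The following is a minimal sketch, not the author's documented usage: the helper names `load_pipeline` and `extract_spans` and the example sentence are illustrative, and it assumes the package registers itself under the name `es_neg_uncert_ehr_ner`. The label names suggest cue and scope tags for negation (`NEG`, `NSCO`) and uncertainty (`UNC`, `USCO`); that reading is an assumption, since the card does not define them.

```python
from typing import Callable, List, Tuple

# (text, label, start_char, end_char) for each predicted entity
Span = Tuple[str, str, int, int]


def load_pipeline(name: str = "es_neg_uncert_ehr_ner"):
    """Load the installed pipeline by package name.

    spaCy is imported lazily so extract_spans can be used and tested
    without the model being installed.
    """
    import spacy

    return spacy.load(name)


def extract_spans(nlp: Callable, text: str) -> List[Span]:
    """Run the pipeline on text and return its entity spans as plain tuples."""
    doc = nlp(text)
    return [(ent.text, ent.label_, ent.start_char, ent.end_char) for ent in doc.ents]


# Example call (requires the wheel from the Installing section):
# nlp = load_pipeline()
# for span in extract_spans(nlp, "No se observan signos de infección."):
#     print(span)
```

Returning plain tuples keeps the result independent of spaCy objects, which makes it easy to serialise or compare against gold annotations.
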
## Dataset

NUBes is a corpus of 29,682 sentences drawn from anonymised health records and annotated with negation and uncertainty.

```bibtex
@article{lima2020nubes,
  title={NUBes: A corpus of negation and uncertainty in Spanish clinical texts},
  author={Lima, Salvador and Perez, Naiara and Cuadros, Montse and Rigau, German},
  journal={arXiv preprint arXiv:2004.01092},
  year={2020}
}
```