Update README.md
Committed by lcampillos
Commit: 95bde2d
Parent(s): a0c23f2
README.md
CHANGED
@@ -2,8 +2,6 @@
 license: cc-by-nc-4.0
 tags:
 - generated_from_trainer
-language:
-- es
 metrics:
 - precision
 - recall
@@ -13,9 +11,8 @@ model-index:
 - name: roberta-es-clinical-trials-neg-spec
 results: []
 widget:
-- text: "Pacientes sanos, sin ninguna enfermedad, que no tomen
+- text: "Pacientes sanos, sin ninguna enfermedad, que no tomen ningún medicamento"
 - text: "Sujetos adultos con cáncer de próstata asintomáticos y no tratados previamente"
-- text: "Enfermedades con posibles síntomas de urticaria o angioedema"
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -25,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This named entity recognition model detects negation and speculation entities, and negated and speculated concepts:
 - Neg_cue: negation cue (e.g. *no*, *sin*)
-- Negated: negated entity or event (e.g. *sin **dolor
+- Negated: negated entity or event (e.g. *sin* **dolor**)
 - Spec_cue: speculation cue (e.g. *posiblemente*)
-- Speculated: speculated entity or event (e.g. *posiblemente **sobreviva
+- Speculated: speculated entity or event (e.g. *posiblemente* **sobreviva**)
 
 The model achieves the following results on the test set (when trained with the training and development set; results are averaged over 5 evaluation rounds):
-- Precision: 0.
+- Precision: 0.840 (±0.003)
 - Recall: 0.866 (±0.005)
-- F1: 0.
-- Accuracy: 0.
+- F1: 0.853 (±0.004)
+- Accuracy: 0.985 (±0.001)
 
 ## Model description
 
 This model adapts the pre-trained model [bsc-bio-ehr-es](https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es), presented in [Pio Carriño et al. (2022)](https://aclanthology.org/2022.bionlp-1.19/).
 It is fine-tuned to conduct medical named entity recognition on Spanish texts about clinical trials.
-The model is fine-tuned on the [NUBEs corpus (Lima et al. 2020)](https://aclanthology.org/2020.lrec-1.708/) and on the [CT-EBM-
+The model is fine-tuned on the [NUBEs corpus (Lima et al. 2020)](https://aclanthology.org/2020.lrec-1.708/) and on the [CT-EBM-ES corpus (Campillos-Llanos et al. 2021)](https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01395-z).
 
 ## Intended uses & limitations
 
@@ -64,15 +61,15 @@ El propietario o creador de los modelos de ningún modo será responsable de los
 
 The data used for fine-tuning are:
 
-1) The [Negation and Uncertainty in Spanish Corpus (NUBes)](https://github.com/Vicomtech/NUBes-negation-uncertainty-biomedical-corpus)
+1) The [Negation and Uncertainty in Spanish Corpus (NUBes)](https://github.com/Vicomtech/NUBes-negation-uncertainty-biomedical-corpus)
 It is a collection of 29 682 sentences (518 068 tokens) from anonymised health records in Spanish, annotated with negation and uncertainty cues and their scopes.
 
-2) The [Clinical Trials for Evidence-Based-Medicine in Spanish corpus](http://www.lllf.uam.es/ESP/nlpdata/wp2/)
+2) The [Clinical Trials for Evidence-Based-Medicine in Spanish corpus](http://www.lllf.uam.es/ESP/nlpdata/wp2/).
 It is a collection of 1200 texts about clinical trials studies and clinical trials announcements:
 - 500 abstracts from journals published under a Creative Commons license, e.g. available in PubMed or the Scientific Electronic Library Online (SciELO)
 - 700 clinical trials announcements published in the European Clinical Trials Register and Repositorio Español de Estudios Clínicos
 
-If you use the CT-EBM-
+If you use the CT-EBM-ES resource, please, cite as follows:
 
 ```
 @article{campillosetal-midm2021,
@@ -100,24 +97,24 @@ The following hyperparameters were used during training:
 - seed: we used different seeds for 5 evaluation rounds, and uploaded the model with the best results
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: average 10.5 epochs (±1.9); trained with early stopping if no improvement after 5 epochs (early stopping patience: 5)
 
 
 ### Training results (test set; average and standard deviation of 5 rounds with different seeds)
 
 | Precision | Recall | F1 | Accuracy |
 |:--------------:|:--------------:|:--------------:|:--------------:|
-| 0.
+| 0.840 (±0.003) | 0.866 (±0.005) | 0.853 (±0.004) | 0.985 (±0.001) |
 
 
 **Results per class (test set; average and standard deviation of 5 rounds with different seeds)**
 
 | Class | Precision | Recall | F1 | Support |
 |:-----------:|:--------------:|:--------------:|:--------------:|:---------:|
-| Neg_cue | 0.
-| Negated | 0.
-| Spec_cue | 0.
-| Speculated | 0.
+| Neg_cue | 0.938 (±0.004) | 0.963 (±0.003) | 0.950 (±0.002) | 2436 |
+| Negated | 0.799 (±0.018) | 0.843 (±0.008) | 0.820 (±0.010) | 3086 |
+| Spec_cue | 0.821 (±0.021) | 0.852 (±0.015) | 0.836 (±0.008) | 749 |
+| Speculated | 0.710 (±0.002) | 0.721 (±0.010) | 0.715 (±0.005) | 996 |
 
 
 ### Framework versions
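Reviewer note: the scores added in this commit can be sanity-checked, since F1 is the harmonic mean of precision and recall. The sketch below recomputes F1 from the precision and recall values in the updated tables and compares it with the reported F1. The helper names `f1` and `max_f1_gap` are hypothetical, and because the card reports averages over 5 rounds, the recomputed values are only expected to match to roughly the third decimal, not exactly.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from the tables added in this commit.
reported = {
    "overall": (0.840, 0.866, 0.853),
    "Neg_cue": (0.938, 0.963, 0.950),
    "Negated": (0.799, 0.843, 0.820),
    "Spec_cue": (0.821, 0.852, 0.836),
    "Speculated": (0.710, 0.721, 0.715),
}


def max_f1_gap(rows: dict) -> float:
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1(p, r) - rep) for p, r, rep in rows.values())


for name, (p, r, rep) in reported.items():
    print(f"{name}: recomputed F1 {f1(p, r):.3f}, reported {rep:.3f}")

print(f"largest gap: {max_f1_gap(reported):.4f}")
```

A gap well below the reported standard deviations suggests the new precision, recall, and F1 figures are mutually consistent; it does not validate the underlying evaluation.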