medspaner/roberta-es-clinical-trials-neg-spec-ner

@@ -2,8 +2,6 @@
 license: cc-by-nc-4.0
 tags:
 - generated_from_trainer
-language:
-- es
 metrics:
 - precision
 - recall
@@ -13,9 +11,8 @@ model-index:
 - name: roberta-es-clinical-trials-neg-spec
   results: []
 widget:
-- text: "Pacientes sanos, sin ninguna enfermedad, que no tomen medicamentos"
 - text: "Sujetos adultos con cáncer de próstata asintomáticos y no tratados previamente"
-- text: "Enfermedades con posibles síntomas de urticaria o angioedema"
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -25,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 This named entity recognition model detects negation and speculation entities, and negated and speculated concepts:
 - Neg_cue: negation cue (e.g. *no*, *sin*)
-- Negated: negated entity or event (e.g. *sin **dolor***)
 - Spec_cue: speculation cue (e.g. *posiblemente*)
-- Speculated: speculated entity or event (e.g. *posiblemente **sobreviva***)
 The model achieves the following results on the test set (when trained with the training and development set; results are averaged over 5 evaluation rounds):
-- Precision: 0.838 (±0.003)
 - Recall: 0.866 (±0.005)
-- F1: 0.852 (±0.003)
-- Accuracy: 0.986 (±0.001)
 ## Model description
 This model adapts the pre-trained model [bsc-bio-ehr-es](https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es), presented in [Pio Carriño et al. (2022)](https://aclanthology.org/2022.bionlp-1.19/).
 It is fine-tuned to conduct medical named entity recognition on Spanish texts about clinical trials.
-The model is fine-tuned on the [NUBEs corpus (Lima et al. 2020)](https://aclanthology.org/2020.lrec-1.708/) and on the [CT-EBM-SP corpus (Campillos-Llanos et al. 2021)](https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01395-z).
 ## Intended uses & limitations
@@ -64,15 +61,15 @@ El propietario o creador de los modelos de ningún modo será responsable de los
 The data used for fine-tuning are:
-1) The [Negation and Uncertainty in Spanish Corpus (NUBes)](https://github.com/Vicomtech/NUBes-negation-uncertainty-biomedical-corpus):
 It is a collection of 29 682 sentences (518 068 tokens) from anonymised health records in Spanish, annotated with negation and uncertainty cues and their scopes.
-2) The [Clinical Trials for Evidence-Based-Medicine in Spanish corpus](http://www.lllf.uam.es/ESP/nlpdata/wp2/):
 It is a collection of 1200 texts about clinical trials studies and clinical trials announcements:
 - 500 abstracts from journals published under a Creative Commons license, e.g. available in PubMed or the Scientific Electronic Library Online (SciELO)
 - 700 clinical trials announcements published in the European Clinical Trials Register and Repositorio Español de Estudios Clínicos
-If you use the CT-EBM-SP resource, please, cite as follows:
 ```
 @article{campillosetal-midm2021,
@@ -100,24 +97,24 @@ The following hyperparameters were used during training:
 - seed: we used different seeds for 5 evaluation rounds, and uploaded the model with the best results
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 8
 ### Training results (test set; average and standard deviation of 5 rounds with different seeds)
 |   Precision    |     Recall     |       F1       |    Accuracy    |
 |:--------------:|:--------------:|:--------------:|:--------------:|
-| 0.838 (±0.003) | 0.866 (±0.005) | 0.852 (±0.003) | 0.986 (±0.001) |
 **Results per class (test set; average and standard deviation of 5 rounds with different seeds)**
 | Class       |   Precision    |     Recall     |       F1       |  Support  |
 |:-----------:|:--------------:|:--------------:|:--------------:|:---------:|
-| Neg_cue     | 0.945 (±0.004) | 0.961 (±0.002) | 0.953 (±0.003) |    2416   |
-| Negated     | 0.815 (±0.003) | 0.838 (±0.005) | 0.826 (±0.003) |    3064   |
-| Spec_cue    | 0.811 (±0.005) | 0.868 (±0.009) | 0.839 (±0.005) |     746   |
-| Speculated  | 0.685 (±0.009) | 0.719 (±0.016) | 0.701 (±0.008) |     993   |
 ### Framework versions

 license: cc-by-nc-4.0
 tags:
 - generated_from_trainer
 metrics:
 - precision
 - recall
 - name: roberta-es-clinical-trials-neg-spec
   results: []
 widget:
+- text: "Pacientes sanos, sin ninguna enfermedad, que no tomen ningún medicamento"
 - text: "Sujetos adultos con cáncer de próstata asintomáticos y no tratados previamente"
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This named entity recognition model detects negation and speculation entities, and negated and speculated concepts:
 - Neg_cue: negation cue (e.g. *no*, *sin*)
+- Negated: negated entity or event (e.g. *sin* **dolor**)
 - Spec_cue: speculation cue (e.g. *posiblemente*)
+- Speculated: speculated entity or event (e.g. *posiblemente* **sobreviva**)
 The model achieves the following results on the test set (when trained with the training and development set; results are averaged over 5 evaluation rounds):
+- Precision: 0.840 (±0.003)
 - Recall: 0.866 (±0.005)
+- F1: 0.853 (±0.004)
+- Accuracy: 0.985 (±0.001)
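As a quick sanity check on the updated figures, F1 is the harmonic mean of precision and recall. The sketch below applies that formula to the averaged precision and recall reported above; it assumes the averaged values can be combined directly, which is only approximately true because the card averages each metric over 5 rounds separately. The function name `f1_score` is illustrative and not part of the model's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averaged precision and recall taken from the list above.
f1 = f1_score(0.840, 0.866)
print(round(f1, 3))  # 0.853
```

The result agrees with the reported F1 of 0.853 to three decimals.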
 ## Model description
 This model adapts the pre-trained model [bsc-bio-ehr-es](https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es), presented in [Pio Carriño et al. (2022)](https://aclanthology.org/2022.bionlp-1.19/).
 It is fine-tuned to conduct medical named entity recognition on Spanish texts about clinical trials.
+The model is fine-tuned on the [NUBes corpus (Lima et al. 2020)](https://aclanthology.org/2020.lrec-1.708/) and on the [CT-EBM-ES corpus (Campillos-Llanos et al. 2021)](https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01395-z).
 ## Intended uses & limitations
 The data used for fine-tuning are:
+1) The [Negation and Uncertainty in Spanish Corpus (NUBes)](https://github.com/Vicomtech/NUBes-negation-uncertainty-biomedical-corpus)
 It is a collection of 29 682 sentences (518 068 tokens) from anonymised health records in Spanish, annotated with negation and uncertainty cues and their scopes.
+2) The [Clinical Trials for Evidence-Based-Medicine in Spanish corpus](http://www.lllf.uam.es/ESP/nlpdata/wp2/).
 It is a collection of 1200 texts about clinical trials studies and clinical trials announcements:
 - 500 abstracts from journals published under a Creative Commons license, e.g. available in PubMed or the Scientific Electronic Library Online (SciELO)
 - 700 clinical trials announcements published in the European Clinical Trials Register and Repositorio Español de Estudios Clínicos
+If you use the CT-EBM-ES resource, please cite it as follows:
 ```
 @article{campillosetal-midm2021,
 - seed: we used different seeds for 5 evaluation rounds, and uploaded the model with the best results
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 10.5 on average (±1.9); training used early stopping with a patience of 5 epochs (stopped after 5 epochs without improvement)
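To illustrate how early stopping with a patience of 5 produces a variable number of epochs per seed, here is a minimal, framework-free sketch. It is not the training code used for this model: the function name `run_with_early_stopping`, the monitored metric, and the per-epoch scores are all hypothetical, and the card does not state which metric was monitored.

```python
def run_with_early_stopping(val_scores, patience=5):
    """Simulate early stopping over a sequence of per-epoch validation scores.

    Returns (epochs_run, best_epoch), both 1-indexed. Training stops once
    `patience` consecutive epochs pass without a strictly better score.
    """
    best_score = float("-inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best_score:
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_scores), best_epoch


# Hypothetical validation F1 per epoch (not real training logs).
scores = [0.80, 0.83, 0.85, 0.84, 0.85, 0.84, 0.83, 0.84, 0.82, 0.81]
print(run_with_early_stopping(scores, patience=5))  # (8, 3)
```

In this made-up run the best score occurs at epoch 3, and training halts at epoch 8 after five epochs without a strict improvement.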
 ### Training results (test set; average and standard deviation of 5 rounds with different seeds)
 |   Precision    |     Recall     |       F1       |    Accuracy    |
 |:--------------:|:--------------:|:--------------:|:--------------:|
+| 0.840 (±0.003) | 0.866 (±0.005) | 0.853 (±0.004) | 0.985 (±0.001) |
 **Results per class (test set; average and standard deviation of 5 rounds with different seeds)**
 | Class       |   Precision    |     Recall     |       F1       |  Support  |
 |:-----------:|:--------------:|:--------------:|:--------------:|:---------:|
+| Neg_cue     | 0.938 (±0.004) | 0.963 (±0.003) | 0.950 (±0.002) |    2436   |
+| Negated     | 0.799 (±0.018) | 0.843 (±0.008) | 0.820 (±0.010) |    3086   |
+| Spec_cue    | 0.821 (±0.021) | 0.852 (±0.015) | 0.836 (±0.008) |     749   |
+| Speculated  | 0.710 (±0.002) | 0.721 (±0.010) | 0.715 (±0.005) |     996   |
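The per-class rows can be related to the overall recall through a support-weighted average, which equals micro-averaged recall when computed on a single run. The sketch below uses the values from the table above. It assumes the overall score is micro-averaged, which the card does not state, and because each figure is itself a mean over 5 rounds, the result is only expected to be close to the reported 0.866.

```python
# (recall, support) per class, copied from the per-class table above.
per_class = {
    "Neg_cue": (0.963, 2436),
    "Negated": (0.843, 3086),
    "Spec_cue": (0.852, 749),
    "Speculated": (0.721, 996),
}

total_support = sum(support for _, support in per_class.values())
weighted_recall = (
    sum(recall * support for recall, support in per_class.values()) / total_support
)

print(total_support)              # 7267
print(round(weighted_recall, 3))  # 0.867
```

The weighted value of 0.867 is within rounding distance of the reported overall recall.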
 ### Framework versions