ccasimiro committed
Commit 7f6592f
Parent: 623c437

add citation

Files changed (1): README.md (+25, -2)
README.md CHANGED
@@ -85,8 +85,31 @@ The model is ready-to-use only for masked language modelling to perform the Fill
  However, the model is intended to be fine-tuned on downstream tasks such as Named Entity Recognition or Text Classification.
 
  ## Cite
- To be announced soon.
-
+ If you use these models, please cite our work:
+
+ ```bibtex
+ @inproceedings{carrino-etal-2022-pretrained,
+     title = "Pretrained Biomedical Language Models for Clinical {NLP} in {S}panish",
+     author = "Carrino, Casimiro Pio and
+       Llop, Joan and
+       P{\`a}mies, Marc and
+       Guti{\'e}rrez-Fandi{\~n}o, Asier and
+       Armengol-Estap{\'e}, Jordi and
+       Silveira-Ocampo, Joaqu{\'\i}n and
+       Valencia, Alfonso and
+       Gonzalez-Agirre, Aitor and
+       Villegas, Marta",
+     booktitle = "Proceedings of the 21st Workshop on Biomedical Language Processing",
+     month = may,
+     year = "2022",
+     address = "Dublin, Ireland",
+     publisher = "Association for Computational Linguistics",
+     url = "https://aclanthology.org/2022.bionlp-1.19",
+     doi = "10.18653/v1/2022.bionlp-1.19",
+     pages = "193--199",
+     abstract = "This work presents the first large-scale biomedical Spanish language models trained from scratch, using large biomedical corpora consisting of a total of 1.1B tokens and an EHR corpus of 95M tokens. We compared them against general-domain and other domain-specific models for Spanish on three clinical NER tasks. As main results, our models are superior across the NER tasks, rendering them more convenient for clinical NLP applications. Furthermore, our findings indicate that when enough data is available, pre-training from scratch is better than continual pre-training when tested on clinical tasks, raising an exciting research question about which approach is optimal. Our models and fine-tuning scripts are publicly available at HuggingFace and GitHub.",
+ }
+ ```
  ---
 
  ## Funding
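
The unchanged context in this hunk notes that the model is ready to use for the Fill Mask task and is meant to be fine-tuned for downstream tasks such as Named Entity Recognition or Text Classification. A minimal sketch of that fill-mask usage with the Hugging Face `transformers` pipeline is shown below; it is not part of this commit, the model identifier is a placeholder for the actual Hub repository name, and the example sentence and mask token are assumptions (the mask token depends on the model's tokenizer).

```python
# Minimal sketch, not part of this commit: querying a masked-language model
# through the Hugging Face `transformers` fill-mask pipeline.
from transformers import pipeline

# Placeholder model identifier -- replace with the actual Hub repository name
# of the model this README documents.
fill_mask = pipeline("fill-mask", model="ORGANIZATION/MODEL_NAME")

# RoBERTa-style tokenizers use "<mask>"; BERT-style tokenizers use "[MASK]".
# Adjust the masked token in the example sentence to match the tokenizer.
predictions = fill_mask("El paciente fue ingresado por una <mask> aguda.")

# Each prediction is a dict containing the filled-in sequence, the predicted
# token string, and its score.
for p in predictions:
    print(p["token_str"], round(p["score"], 4), p["sequence"])
```

For the downstream tasks mentioned in the README, the same checkpoint would instead be loaded with a task-specific head (for example `AutoModelForTokenClassification` for NER) and fine-tuned on labelled data.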