arxiv:2012.13978
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining
Published on Dec 27, 2020
Abstract
One of the biggest challenges preventing the use of many current NLP methods in clinical settings is the limited availability of public datasets. In this work, we present MeDAL, a large medical text dataset curated for abbreviation disambiguation and designed for natural language understanding pre-training in the medical domain. We pre-trained several models of common architectures on this dataset and showed empirically that such pre-training improves performance and convergence speed when fine-tuning on downstream medical tasks.