Danish medical BERT
MeDa-BERT was initialized with weights from a pretrained Danish BERT model and pretrained for 48 epochs using the MLM objective on a Danish medical corpus of 123M tokens.
The development of the corpus and model is described further in the paper cited below.
Here is an example of how to load the model in PyTorch using the 🤗 Transformers library:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("indsigt-ai/MeDa-BERT")
model = AutoModelForMaskedLM.from_pretrained("indsigt-ai/MeDa-BERT")
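Since the model was pretrained with the MLM objective, a quick way to try it is to mask one word in a sentence and look at the tokens it proposes. The following is a minimal sketch, not part of the original model card: the helper names `mask_word` and `predict_masked` and the Danish example sentence are made up for illustration, and the default `[MASK]` token is an assumption based on standard BERT tokenizers (the sketch reads the actual mask token from the loaded tokenizer when predicting).

```python
def mask_word(sentence: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `word` in `sentence` with the mask token.

    Hypothetical helper for illustration; "[MASK]" is assumed as the default
    because it is the usual BERT mask token.
    """
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def predict_masked(sentence: str, word: str, top_k: int = 5):
    """Return (token, score) pairs the model proposes for the masked word.

    Hypothetical helper; calling it downloads the model from the Hub.
    """
    # Imported lazily so mask_word can be used without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="indsigt-ai/MeDa-BERT")
    # Use the tokenizer's own mask token rather than relying on the default.
    masked = mask_word(sentence, word, fill_mask.tokenizer.mask_token)
    results = fill_mask(masked, top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

For example, `predict_masked("Patienten har feber og hoste.", "feber")` would mask "feber" and return the model's top five candidate tokens with their scores. No expected output is shown here because it depends on the model weights.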
Citing
@inproceedings{pedersen-etal-2023-meda,
title = "{M}e{D}a-{BERT}: A medical {D}anish pretrained transformer model",
author = "Pedersen, Jannik and
Laursen, Martin and
Vinholt, Pernille and
Savarimuthu, Thiusius Rajeeth",
booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)",
month = may,
year = "2023",
address = "T{\'o}rshavn, Faroe Islands",
publisher = "University of Tartu Library",
url = "https://aclanthology.org/2023.nodalida-1.31",
pages = "301--307",
}