---
license: agpl-3.0
datasets:
  - vesteinn/FC3
  - vesteinn/IC3
  - mideind/icelandic-common-crawl-corpus-IC3
  - DDSC/partial-danish-gigaword-no-twitter
  - NbAiLab/NCC
widget:
  - text: Býir vaksa <mask> enn nakað annað búøki á jørðini.
language:
  - fo
---

FoBERT is a Faroese language model. It was trained by adapting the ScandiBERT-no-faroese model to the FC3 corpus for 50 epochs.
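As a masked language model, it can be tried out with the `fill-mask` pipeline from the `transformers` library. The following is a minimal sketch, not an official usage recipe: the Hub id `vesteinn/FoBERT` is an assumption inferred from the author and model name, and the helper names `load_fill_mask`, `top_predictions`, and `main` are hypothetical. The example sentence is the one from the widget metadata.

```python
from typing import Callable, List, Tuple

# Assumption: the Hub id is inferred from the author and model name.
MODEL_ID = "vesteinn/FoBERT"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_predictions(fill_mask: Callable, text: str, k: int = 5) -> List[Tuple[str, float]]:
    """Return (token, score) pairs for the single <mask> in `text`."""
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), float(r["score"])) for r in results]


def main() -> None:
    """Print the top candidates for the widget sentence."""
    fill_mask = load_fill_mask()
    text = "Býir vaksa <mask> enn nakað annað búøki á jørðini."
    for token, score in top_predictions(fill_mask, text):
        print(f"{score:.3f}  {token}")
```

Calling `main()` requires `transformers` and a backend such as PyTorch to be installed, and network access for the first download.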

If you find this model useful, please cite:

@inproceedings{snaebjarnarson-etal-2023-transfer,
    title = "{T}ransfer to a Low-Resource Language via Close Relatives: The Case Study on Faroese",
    author = "Snæbjarnarson, Vésteinn  and
      Simonsen, Annika  and
      Glavaš, Goran  and
      Vulić, Ivan",
    booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)",
    month = "may 22--24",
    year = "2023",
    address = "Tórshavn, Faroe Islands",
    publisher = {Link{\"o}ping University Electronic Press, Sweden},
}