|
--- |
|
pipeline_tag: fill-mask |
|
--- |
|
|
|
## XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models |
|
|
|
This repository contains a checkpoint of XLM-V converted from fairseq to the Hugging Face `transformers` format.
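The converted checkpoint can be loaded with the standard `transformers` fill-mask pipeline. A minimal sketch is below; the repo id `facebook/xlm-v-base` is an assumption about the official upload and may need to be replaced with this repository's id:

```python
from transformers import pipeline

def load_xlmv(model_id: str = "facebook/xlm-v-base"):
    # Return a fill-mask pipeline for XLM-V.
    # "facebook/xlm-v-base" is an assumed repo id; substitute the id of
    # this (or another) converted checkpoint if needed.
    return pipeline("fill-mask", model=model_id)

# Example usage (downloads several GB of weights on first run):
# unmasker = load_xlmv()
# unmasker("Paris is the <mask> of France.")
```

XLM-V shares the XLM-R architecture, so the usual `XLMRobertaTokenizer`/`XLMRobertaForMaskedLM` classes apply; only the (much larger) vocabulary differs.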
|
|
|
## Fairseq |
|
|
|
If the original fairseq model is needed, the checkpoint is available at:
|
``` |
|
https://dl.fbaipublicfiles.com/fairseq/xlmv/xlmv.base.tar.gz |
|
``` |
|
Usage instructions are provided in the fairseq XLM-R README:
|
``` |
|
https://github.com/facebookresearch/fairseq/blob/main/examples/xlmr/README.md |
|
``` |
|
|
|
**Note: please use the official checkpoints if/when they are added to `transformers`** (this repo is intended for personal usage and experiments).
|
|
|
## Citation

```bibtex
|
@misc{https://doi.org/10.48550/arxiv.2301.10472, |
|
doi = {10.48550/ARXIV.2301.10472}, |
|
url = {https://arxiv.org/abs/2301.10472}, |
|
author = {Liang, Davis and Gonen, Hila and Mao, Yuning and Hou, Rui and Goyal, Naman and Ghazvininejad, Marjan and Zettlemoyer, Luke and Khabsa, Madian}, |
|
keywords = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences}, |
|
title = {XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models}, |
|
publisher = {arXiv}, |
|
year = {2023}, |
|
copyright = {Creative Commons Attribution Share Alike 4.0 International} |
|
} |
|
``` |