---
license: apache-2.0
tags:
- text-classification
- language-identification
library_name: fasttext
datasets:
- cis-lmu/GlotSparse
- cis-lmu/GlotStoryBook
metrics:
- f1
---

# GlotLID

## Description

GlotLID is a fastText language identification (LID) model for around 2000 languages.

### How to use

Here is how to use this model to detect the language of a given text:

```python
>>> import fasttext
>>> from huggingface_hub import hf_hub_download

>>> model_path = hf_hub_download(repo_id="cis-lmu/glotlid", filename="model.bin")
>>> model = fasttext.load_model(model_path)
>>> model.predict("Hello, world!")
```

## License

The model is distributed under the Apache License, Version 2.0.

## References

If you use this model, please cite the following paper:

```
@inproceedings{
kargaran2023glotlid,
title={{GlotLID}: Language Identification for Low-Resource Languages},
author={Kargaran, Amir Hossein and Imani, Ayyoob and Yvon, Fran{\c{c}}ois and Sch{\"u}tze, Hinrich},
booktitle={The 2023 Conference on Empirical Methods in Natural Language Processing},
year={2023},
url={https://openreview.net/forum?id=dl4e3EBz5j}
}
```
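
As a follow-up to the snippet in "How to use": fastText's `predict` returns a pair of (labels, probabilities), and each label carries the `__label__` prefix by default. The sketch below shows one way to turn that pair into plain `(code, probability)` tuples. The helper name `parse_prediction` is hypothetical, the sample values are made up, and the `eng_Latn` label shape (language code plus script) is an assumption about the model's label set rather than something stated in this card.

```python
from typing import List, Sequence, Tuple

# fastText's default label prefix.
LABEL_PREFIX = "__label__"


def parse_prediction(
    labels: Sequence[str], probs: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each label (with the prefix stripped) with its probability.

    Hypothetical helper, not part of fastText or GlotLID.
    """
    results = []
    for label, prob in zip(labels, probs):
        # Drop the prefix only when it is present; leave other labels untouched.
        if label.startswith(LABEL_PREFIX):
            label = label[len(LABEL_PREFIX):]
        results.append((label, float(prob)))
    return results


# Made-up prediction shaped like the (labels, probabilities) pair from model.predict.
example_labels = ("__label__eng_Latn",)
example_probs = [0.98]
print(parse_prediction(example_labels, example_probs))
```

With the loaded model, a call such as `parse_prediction(*model.predict("Hello, world!", k=3))` would list the three highest-scoring candidates, since `predict` accepts a `k` argument for the number of labels to return.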