Commit d65d330 by cointegrated (parent: 9b9680e)
Commit message: Update README.md

README.md CHANGED
@@ -184,8 +184,8 @@ language:
 It is a truncated version of [NLLB-200-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) model
 (6 layers instead of 12, 512 hidden dimensions instead of 1024) with 175M parameters (131M of which are token embeddings).

-This model was fine-tuned on the [slone/nllb-200-10M-sample](https://huggingface.co/datasets/slone/nllb-200-10M-sample) subset of
-with 175 languages, using only the samples with BLASER score above 3.5.
+This model was fine-tuned on the [slone/nllb-200-10M-sample](https://huggingface.co/datasets/slone/nllb-200-10M-sample) subset of
+the [NLLB dataset](https://huggingface.co/datasets/allenai/nllb) with 175 languages, using only the samples with BLASER score above 3.5.

 Because of its small size, it is really bad at translation, but can serve as a base model for further fine-tuning for a small number of languages.
 It is recommended to prune the vocabulary of this model before fine-tuning, to preserve only the tokens used with the intended languages.
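The README's closing recommendation — prune the vocabulary before fine-tuning so that only tokens for the intended languages remain — boils down to slicing the token-embedding matrix and remapping token ids. As an illustration of that core step only (not the model card's own recipe), here is a minimal sketch; the helper name and the toy data are hypothetical, and in practice the inputs would come from the model's tokenizer and embedding layer:

```python
import numpy as np


def prune_embeddings(embeddings, kept_token_ids):
    """Keep only the embedding rows for the retained tokens.

    Returns the smaller matrix plus a map from old token ids to new ids,
    which is needed to re-encode text against the pruned vocabulary.
    """
    old_to_new = {old: new for new, old in enumerate(kept_token_ids)}
    pruned = embeddings[kept_token_ids]
    return pruned, old_to_new


# Toy example: a "vocabulary" of 10 tokens with 4-dimensional embeddings,
# of which we keep only tokens 0, 3, and 7.
emb = np.arange(40, dtype=np.float32).reshape(10, 4)
pruned, mapping = prune_embeddings(emb, [0, 3, 7])
print(pruned.shape)  # (3, 4)
print(mapping[7])    # old id 7 becomes new id 2
```

For the real model one would additionally rebuild the tokenizer's vocabulary file with the same kept-token list, since the embedding row order and the tokenizer ids must stay in sync.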