
nllb-200-distilled-600M_mustc_en-to-8

This is a multilingually fine-tuned version of NLLB, based on nllb-200-distilled-600M, trained on the text data of MuST-C v1.0 for the eight En→X directions (De, Es, Fr, It, Nl, Pt, Ro, Ru).

It is part of the paper Pushing the Limits of Zero-shot End-to-end Speech Translation. Details on the fine-tuning process are available in Appendix D of the paper.

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint; the NLLB tokenizer defaults to English
# ("eng_Latn") as the source language.
tokenizer = AutoTokenizer.from_pretrained("johntsi/nllb-200-distilled-600M_mustc_en-to-8")
model = AutoModelForSeq2SeqLM.from_pretrained("johntsi/nllb-200-distilled-600M_mustc_en-to-8")

model.eval()
model.to("cuda")

# English input to be translated.
text = "Translate this text to German."
inputs = tokenizer(text, return_tensors="pt").to("cuda")

# Force the decoder to start with the target-language token (German here).
# Note: on newer transformers versions, where lang_code_to_id may be
# unavailable, tokenizer.convert_tokens_to_ids("deu_Latn") is equivalent.
outputs = model.generate(
    **inputs,
    num_beams=5,
    forced_bos_token_id=tokenizer.lang_code_to_id["deu_Latn"]
)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translated_text)
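
The same checkpoint covers all eight fine-tuned directions; the only change per target language is the forced BOS token. Below is a minimal sketch (not from the original card, reusing tokenizer, model, and inputs from above) that loops over the FLORES-200 codes of the eight MuST-C targets:

# FLORES-200 codes for the eight MuST-C target languages.
target_langs = {
    "German": "deu_Latn",
    "Spanish": "spa_Latn",
    "French": "fra_Latn",
    "Italian": "ita_Latn",
    "Dutch": "nld_Latn",
    "Portuguese": "por_Latn",
    "Romanian": "ron_Latn",
    "Russian": "rus_Cyrl",
}

for name, code in target_langs.items():
    outputs = model.generate(
        **inputs,
        num_beams=5,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(code),
    )
    print(f"{name}: {tokenizer.decode(outputs[0], skip_special_tokens=True)}")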

Results

BLEU scores on MuST-C v1.0 tst-COMMON

| Model | De | Es | Fr | It | Nl | Pt | Ro | Ru | Average |
|---|---|---|---|---|---|---|---|---|---|
| nllb-200-distilled-600M (original) | 32.7 | 36.9 | 45.2 | 32.2 | 36.0 | 37.4 | 30.3 | 21.0 | 34.0 |
| nllb-200-distilled-600M_mustc_en-to-8 | 34.4 | 38.8 | 44.6 | 34.7 | 39.0 | 41.6 | 32.1 | 22.4 | 35.9 |
| nllb-200-distilled-1.3B (original) | 34.6 | 38.6 | 46.8 | 33.7 | 38.2 | 39.6 | 31.8 | 23.2 | 35.8 |
| nllb-200-distilled-1.3B_mustc_en-to-8 | 35.3 | 39.9 | 45.8 | 36.0 | 40.6 | 43.1 | 32.6 | 23.9 | 37.2 |
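
The exact evaluation setup is described in the paper; as a rough illustration only (an assumption, not taken from the card), corpus-level BLEU over a list of hypotheses and references can be computed with sacreBLEU:

import sacrebleu

# Hypothetical lists: model translations of tst-COMMON and the
# corresponding references, one string per segment.
hypotheses = ["Das ist ein Beispiel.", "Noch ein Satz."]
references = [["Das ist ein Beispiel.", "Noch ein Satz."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")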

Citation

If you find these models useful for your research, please cite our paper :)

@misc{tsiamas2024pushing,
      title={{Pushing the Limits of Zero-shot End-to-End Speech Translation}}, 
      author={Ioannis Tsiamas and Gerard I. Gállego and José A. R. Fonollosa and Marta R. Costa-jussà},
      year={2024},
      eprint={2402.10422},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}