eng-fra

  • source language name: English

  • target language name: French

  • OPUS readme: README.md

  • model: transformer-align

  • source language code: en

  • target language code: fr

  • dataset: opus

  • release date: 2021-02-22

  • pre-processing: normalization + SentencePiece (spm32k,spm32k)

  • download original weights: opus-2021-02-22.zip

  • Training data:

    • fra-eng: Tatoeba-train (180923857)
  • Validation data:

    • eng-fra: Tatoeba-dev, 250098
    • total-size-shuffled: 249757
    • devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled
  • Test data:

    • newsdiscussdev2015-enfr.eng-fra: 1500/27986
    • newsdiscusstest2015-enfr.eng-fra: 1500/28027
    • newssyscomb2009.eng-fra: 502/12334
    • news-test2008.eng-fra: 2051/52685
    • newstest2009.eng-fra: 2525/69278
    • newstest2010.eng-fra: 2489/66043
    • newstest2011.eng-fra: 3003/80626
    • newstest2012.eng-fra: 3003/78011
    • newstest2013.eng-fra: 3000/70037
    • Tatoeba-test.eng-fra: 10000/80769
    • tico19-test.eng-fra: 2100/64655
  • test set translations file: test.txt

  • test set scores file: eval.txt

  • BLEU-scores

    Test set                          BLEU
    Tatoeba-test.eng-fra              50.8
    tico19-test.eng-fra               41.8
    newsdiscusstest2015-enfr.eng-fra  40.8
    newstest2011.eng-fra              34.6
    newsdiscussdev2015-enfr.eng-fra   33.9
    newstest2013.eng-fra              33.5
    newstest2010.eng-fra              33.0
    newstest2012.eng-fra              32.0
    newssyscomb2009.eng-fra           30.0
    newstest2009.eng-fra              29.9
    news-test2008.eng-fra             27.5
  • chr-F-scores

    Test set                          chr-F
    Tatoeba-test.eng-fra              0.671
    newsdiscusstest2015-enfr.eng-fra  0.649
    tico19-test.eng-fra               0.638
    newstest2011.eng-fra              0.614
    newsdiscussdev2015-enfr.eng-fra   0.606
    newstest2010.eng-fra              0.599
    newstest2012.eng-fra              0.593
    newstest2013.eng-fra              0.591
    newssyscomb2009.eng-fra           0.587
    newstest2009.eng-fra              0.580
    news-test2008.eng-fra             0.556
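
The test sets listed under "Test data" differ a lot in sentence length, which is worth keeping in mind when comparing scores across them. The sketch below derives an average length per test set from the counts given in that list. It assumes that each `N/M` pair means sentences/words, which this card does not state explicitly; the function name `words_per_sentence` is illustrative only.

```python
from typing import Dict

# Counts copied from the "Test data" list of this card.
# Assumption: the format is "<test set>: <sentences>/<words>".
TEST_SETS = """\
newsdiscussdev2015-enfr.eng-fra: 1500/27986
newsdiscusstest2015-enfr.eng-fra: 1500/28027
newssyscomb2009.eng-fra: 502/12334
news-test2008.eng-fra: 2051/52685
newstest2009.eng-fra: 2525/69278
newstest2010.eng-fra: 2489/66043
newstest2011.eng-fra: 3003/80626
newstest2012.eng-fra: 3003/78011
newstest2013.eng-fra: 3000/70037
Tatoeba-test.eng-fra: 10000/80769
tico19-test.eng-fra: 2100/64655
"""


def words_per_sentence(listing: str) -> Dict[str, float]:
    """Map each test set name to its average number of words per sentence."""
    averages = {}
    for line in listing.strip().splitlines():
        name, counts = line.split(": ")
        sentences, words = (int(n) for n in counts.split("/"))
        averages[name] = words / sentences
    return averages


averages = words_per_sentence(TEST_SETS)

# Print the test sets from shortest to longest average sentence.
for name, avg in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{name:<34}{avg:6.1f}")
```

Under that assumption, Tatoeba-test averages roughly 8 words per sentence, while the news and tico19 sets are several times longer.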

System Info:

  • hf_name: eng-fra
  • source_languages: en
  • target_languages: fr
  • opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/eng-fra/opus-2021-02-22.zip/README.md
  • original_repo: Tatoeba-Challenge
  • tags: ['translation']
  • languages: ['en', 'fr']
  • src_constituents: ['eng']
  • tgt_constituents: ['fra']
  • src_multilingual: False
  • tgt_multilingual: False
  • helsinki_git_sha: 6faf2dab0b7b01a0e08a114dbacbb7deac54988d
  • transformers_git_sha: e9a6c72b5edfb9561a981959b0e7c62d8ab9ef6c
  • port_machine: 146-193-182-187.edr.inesc.pt
  • port_time: 2023-11-08-11:42
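
Since the System Info above records a `transformers` commit, the ported weights can presumably be loaded with the Hugging Face `transformers` library. The following is a minimal sketch under that assumption, not an official usage recipe: the function name `translate` is illustrative, and `model_id` must be supplied by the caller because this card does not state the Hub repository ID of the checkpoint. It requires `transformers`, `torch`, and `sentencepiece` to be installed.

```python
from typing import List


def translate(texts: List[str], model_id: str, max_new_tokens: int = 256) -> List[str]:
    """Translate English sentences to French with a seq2seq checkpoint.

    model_id is a Hub repository ID or a local directory; it is an input
    the caller must provide, since the card does not name it.
    """
    # Imported lazily so this module can be loaded without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # The tokenizer applies the SentencePiece segmentation shipped with the checkpoint.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

A call would then look like `translate(["How are you?"], model_id="<path-or-repo-of-this-checkpoint>")`, where the placeholder stands for wherever the weights actually live.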