metadata
language:
  - fr
  - en
tags:
  - translation
license: apache-2.0

fra-eng

  • source language name: French

  • target language name: English

  • OPUS readme: README.md

  • model: transformer-align

  • source language code: fr

  • target language code: en

  • dataset: opus

  • release date: 2021-02-22

  • pre-processing: normalization + SentencePiece (spm32k,spm32k)

  • download original weights: opus-2021-02-22.zip

  • Training data:

    • fra-eng: Tatoeba-train (180923857)
  • Validation data:

    • eng-fra: Tatoeba-dev, 250098
    • total-size-shuffled: 249757
    • devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled
  • Test data:

    • newsdiscussdev2015-enfr.fra-eng: 1500/27759
    • newsdiscusstest2015-enfr.fra-eng: 1500/26995
    • newssyscomb2009.fra-eng: 502/11821
    • news-test2008.fra-eng: 2051/49380
    • newstest2009.fra-eng: 2525/65402
    • newstest2010.fra-eng: 2489/61724
    • newstest2011.fra-eng: 3003/74681
    • newstest2012.fra-eng: 3003/72812
    • newstest2013.fra-eng: 3000/64505
    • newstest2014-fren.fra-eng: 3003/70708
    • Tatoeba-test.fra-eng: 10000/77174
  • test set translations file: test.txt

  • test set scores file: eval.txt

  • BLEU-scores

    Test set                           score
    Tatoeba-test.fra-eng               57.8
    newsdiscusstest2015-enfr.fra-eng   39.7
    newstest2014-fren.fra-eng          38.4
    newsdiscussdev2015-enfr.fra-eng    34.4
    newstest2013.fra-eng               34.0
    newstest2012.fra-eng               33.2
    newstest2011.fra-eng               33.1
    newstest2010.fra-eng               32.7
    newssyscomb2009.fra-eng            31.1
    newstest2009.fra-eng               30.5
    news-test2008.fra-eng              26.5
  • chr-F-scores

    Test set                           score
    Tatoeba-test.fra-eng               0.723
    newstest2014-fren.fra-eng          0.636
    newsdiscusstest2015-enfr.fra-eng   0.621
    newstest2011.fra-eng               0.598
    newstest2010.fra-eng               0.593
    newstest2012.fra-eng               0.593
    newstest2013.fra-eng               0.592
    newsdiscussdev2015-enfr.fra-eng    0.587
    newssyscomb2009.fra-eng            0.575
    newstest2009.fra-eng               0.572
    news-test2008.fra-eng              0.544
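
To give a feel for what the chr-F numbers above measure, here is a minimal sketch of a character n-gram F-score for a single hypothesis/reference pair. It is a simplified illustration, not the scorer that produced the values in this card: the function names `char_ngrams` and `chrf_sketch` are hypothetical, and the defaults (n-grams up to length 6, beta = 2, whitespace ignored) are assumptions based on common chrF settings.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring spaces (an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf_sketch(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified character n-gram F-score in the range 0..1.

    Precision and recall are averaged over the n-gram orders that exist
    in both strings, then combined into an F-beta score.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    print(chrf_sketch("the cat sat on the mat", "the cat sits on the mat"))
```

With beta = 2, recall is weighted more heavily than precision, so a hypothesis that omits reference material is penalized more than one that adds extra characters. For reproducible scores, an established evaluation tool is a better choice than this sketch.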

System Info:

  • hf_name: fra-eng
  • source_languages: fr
  • target_languages: en
  • opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/opus-2021-02-22.zip/README.md
  • original_repo: Tatoeba-Challenge
  • tags: ['translation']
  • languages: ['fr', 'en']
  • src_constituents: ['fra']
  • tgt_constituents: ['eng']
  • src_multilingual: False
  • tgt_multilingual: False
  • helsinki_git_sha: 6faf2dab0b7b01a0e08a114dbacbb7deac54988d
  • transformers_git_sha: e9a6c72b5edfb9561a981959b0e7c62d8ab9ef6c
  • port_machine: 146-193-182-187.edr.inesc.pt
  • port_time: 2023-11-06-16:20
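
The "Test data" entries earlier in this card pair each test set with two numbers. A small sketch like the one below can turn them into per-set statistics. It assumes the two numbers are the sentence count and the token count, which the card does not state explicitly. The names `TEST_DATA`, `parse_test_data`, and `tokens_per_sentence` are hypothetical helpers introduced for illustration.

```python
# Entries copied from the "Test data" list of this card.
TEST_DATA = """\
newsdiscussdev2015-enfr.fra-eng: 1500/27759
newsdiscusstest2015-enfr.fra-eng: 1500/26995
newssyscomb2009.fra-eng: 502/11821
news-test2008.fra-eng: 2051/49380
newstest2009.fra-eng: 2525/65402
newstest2010.fra-eng: 2489/61724
newstest2011.fra-eng: 3003/74681
newstest2012.fra-eng: 3003/72812
newstest2013.fra-eng: 3000/64505
newstest2014-fren.fra-eng: 3003/70708
Tatoeba-test.fra-eng: 10000/77174
"""


def parse_test_data(text):
    """Map each test set name to a (first, second) pair of integers."""
    stats = {}
    for line in text.strip().splitlines():
        name, counts = line.rsplit(":", 1)
        first, second = counts.strip().split("/")
        stats[name.strip()] = (int(first), int(second))
    return stats


def tokens_per_sentence(stats):
    """Average tokens per sentence, ASSUMING pairs are (sentences, tokens)."""
    return {name: tokens / sentences
            for name, (sentences, tokens) in stats.items()}


if __name__ == "__main__":
    averages = tokens_per_sentence(parse_test_data(TEST_DATA))
    for name, avg in sorted(averages.items(), key=lambda item: item[1]):
        print(f"{name}: {avg:.1f}")
```

Under that assumption, the averages show how segment length differs between the Tatoeba test set and the news test sets, which is useful context when comparing scores across them.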