---
language:
- fr
- en

tags:
- translation

license: apache-2.0
---

### fra-eng

* source language name: French
* target language name: English
* OPUS readme: [README.md](https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/README.md)

* model: transformer-align
* source language code: fr
* target language code: en
* dataset: opus
* release date: 2021-02-22
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* download original weights: [opus-2021-02-22.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/opus-2021-02-22.zip/fra-eng/opus-2021-02-22.zip)
* Training data:
  * fra-eng: Tatoeba-train (180923857)
* Validation data:
  * eng-fra: Tatoeba-dev, 250098
  * total-size-shuffled: 249757
  * devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
* Test data:
  * newsdiscussdev2015-enfr.fra-eng: 1500/27759
  * newsdiscusstest2015-enfr.fra-eng: 1500/26995
  * newssyscomb2009.fra-eng: 502/11821
  * news-test2008.fra-eng: 2051/49380
  * newstest2009.fra-eng: 2525/65402
  * newstest2010.fra-eng: 2489/61724
  * newstest2011.fra-eng: 3003/74681
  * newstest2012.fra-eng: 3003/72812
  * newstest2013.fra-eng: 3000/64505
  * newstest2014-fren.fra-eng: 3003/70708
  * Tatoeba-test.fra-eng: 10000/77174
* test set translations file: [test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/opus-2021-02-22.zip/fra-eng/opus-2021-02-22.test.txt)
* test set scores file: [eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/opus-2021-02-22.zip/fra-eng/opus-2021-02-22.eval.txt)

* BLEU-scores

|Test set|score|
|---|---|
|Tatoeba-test.fra-eng|57.8|
|newsdiscusstest2015-enfr.fra-eng|39.7|
|newstest2014-fren.fra-eng|38.4|
|newsdiscussdev2015-enfr.fra-eng|34.4|
|newstest2013.fra-eng|34.0|
|newstest2012.fra-eng|33.2|
|newstest2011.fra-eng|33.1|
|newstest2010.fra-eng|32.7|
|newssyscomb2009.fra-eng|31.1|
|newstest2009.fra-eng|30.5|
|news-test2008.fra-eng|26.5|

* chr-F-scores

|Test set|score|
|---|---|
|Tatoeba-test.fra-eng|0.723|
|newstest2014-fren.fra-eng|0.636|
|newsdiscusstest2015-enfr.fra-eng|0.621|
|newstest2011.fra-eng|0.598|
|newstest2010.fra-eng|0.593|
|newstest2012.fra-eng|0.593|
|newstest2013.fra-eng|0.592|
|newsdiscussdev2015-enfr.fra-eng|0.587|
|newssyscomb2009.fra-eng|0.575|
|newstest2009.fra-eng|0.572|
|news-test2008.fra-eng|0.544|

### System Info:

* hf_name: fra-eng
* source_languages: fr
* target_languages: en
* opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/fra-eng/opus-2021-02-22.zip/README.md
* original_repo: Tatoeba-Challenge
* tags: ['translation']
* languages: ['fr', 'en']
* src_constituents: ['fra']
* tgt_constituents: ['eng']
* src_multilingual: False
* tgt_multilingual: False
* helsinki_git_sha: 6faf2dab0b7b01a0e08a114dbacbb7deac54988d
* transformers_git_sha: e9a6c72b5edfb9561a981959b0e7c62d8ab9ef6c
* port_machine: 146-193-182-187.edr.inesc.pt
* port_time: 2023-11-06-16:20
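
### Example usage (sketch)

The System Info above records a port to `transformers`, so the checkpoint can presumably be loaded with that library's Marian classes. The following is a minimal sketch under that assumption, not a documented recipe for this model. `MODEL_ID` is a hypothetical placeholder, since this card does not state the repository name or local path, and `load_marian` and `translate` are helper names introduced here for illustration only.

```python
from typing import List, Tuple

# Hypothetical placeholder: replace with the local directory or Hub ID
# under which this fra-eng checkpoint is stored (not stated in this card).
MODEL_ID = "path/to/fra-eng-checkpoint"


def load_marian(model_id: str) -> Tuple[object, object]:
    """Load tokenizer and model, assuming the checkpoint is a Marian port."""
    # Imported lazily so that translate() can be reused or tested
    # without transformers being installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)
    return tokenizer, model


def translate(sentences: List[str], tokenizer, model) -> List[str]:
    """Translate a batch of French sentences into English."""
    # Pad to the longest sentence in the batch and truncate overlong inputs.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    # Drop padding and end-of-sentence tokens from the decoded text.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_marian(MODEL_ID)` followed by `translate(["Bonjour le monde."], tokenizer, model)`. Loading requires the `transformers`, `torch`, and `sentencepiece` packages. Keeping loading separate from translation lets the model be loaded once and reused across batches.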