---
language:
- en
- de
- multilingual

license: cc-by-4.0

tags:
- translation
- opus-mt

model-index:
- name: opus-mt-eng-deu
  results:
  - task:
      type: translation
      name: Translation eng-deu
    dataset:
      name: Tatoeba-test.eng-deu
      type: tatoeba_mt
      args: eng-deu
    metrics:
    - type: bleu
      value: 45.8
      name: BLEU
---

# Opus Tatoeba English-German

*This model was obtained by running the script [convert_marian_to_pytorch.py](https://github.com/huggingface/transformers/blob/master/src/transformers/models/marian/convert_marian_to_pytorch.py) - [Instruction available here](https://github.com/huggingface/transformers/tree/main/scripts/tatoeba). The original models were trained by [Jörg Tiedemann](https://blogs.helsinki.fi/tiedeman/) using the [MarianNMT](https://marian-nmt.github.io/) library. See all available `MarianMTModel` models on the profile of the [Helsinki NLP](https://huggingface.co/Helsinki-NLP) group.*

*This is the conversion of checkpoint [opus-2021-02-22.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-deu/opus-2021-02-22.zip/eng-deu/opus-2021-02-22.zip)*

---

### eng-deu

* source language name: English
* target language name: German
* OPUS readme: [README.md](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-deu/README.md)
* model: transformer
* source language code: en
* target language code: de
* dataset: opus
* release date: 2021-02-22
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* download original weights: [opus-2021-02-22.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-deu/opus-2021-02-22.zip/eng-deu/opus-2021-02-22.zip)
* Training data:
  * deu-eng: Tatoeba-train (86845165)
* Validation data:
  * deu-eng: Tatoeba-dev, 284809
  * total-size-shuffled: 284809
  * devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
* Test data:
  * newssyscomb2009.eng-deu: 502/11271
  * news-test2008.eng-deu: 2051/47427
  * newstest2009.eng-deu: 2525/62816
  * newstest2010.eng-deu: 2489/61511
  * newstest2011.eng-deu: 3003/72981
  * newstest2012.eng-deu: 3003/72886
  * newstest2013.eng-deu: 3000/63737
  * newstest2014-deen.eng-deu: 3003/62964
  * newstest2015-ende.eng-deu: 2169/44260
  * newstest2016-ende.eng-deu: 2999/62670
  * newstest2017-ende.eng-deu: 3004/61291
  * newstest2018-ende.eng-deu: 2998/64276
  * newstest2019-ende.eng-deu: 1997/48969
  * Tatoeba-test.eng-deu: 10000/83347
* test set translations file: [test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-deu/opus-2021-02-22.zip/eng-deu/opus-2021-02-22.test.txt)
* test set scores file: [eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-deu/opus-2021-02-22.zip/eng-deu/opus-2021-02-22.eval.txt)
* BLEU-scores

|Test set|score|
|---|---|
|newstest2018-ende.eng-deu|46.4|
|Tatoeba-test.eng-deu|45.8|
|newstest2019-ende.eng-deu|42.4|
|newstest2016-ende.eng-deu|37.9|
|newstest2015-ende.eng-deu|32.0|
|newstest2017-ende.eng-deu|30.6|
|newstest2014-deen.eng-deu|29.6|
|newstest2013.eng-deu|27.6|
|newstest2010.eng-deu|25.9|
|news-test2008.eng-deu|23.9|
|newstest2012.eng-deu|23.8|
|newssyscomb2009.eng-deu|23.3|
|newstest2011.eng-deu|22.9|
|newstest2009.eng-deu|22.7|

* chr-F-scores

|Test set|score|
|---|---|
|newstest2018-ende.eng-deu|0.697|
|newstest2019-ende.eng-deu|0.664|
|Tatoeba-test.eng-deu|0.655|
|newstest2016-ende.eng-deu|0.644|
|newstest2015-ende.eng-deu|0.601|
|newstest2014-deen.eng-deu|0.595|
|newstest2017-ende.eng-deu|0.593|
|newstest2013.eng-deu|0.558|
|newstest2010.eng-deu|0.55|
|newssyscomb2009.eng-deu|0.539|
|news-test2008.eng-deu|0.533|
|newstest2009.eng-deu|0.533|
|newstest2012.eng-deu|0.53|
|newstest2011.eng-deu|0.528|
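
Since the card points to `MarianMTModel` but does not show how to call it, the following is a minimal sketch of how a converted checkpoint of this kind might be loaded and used for English-to-German translation with the `transformers` library. It assumes `transformers`, `sentencepiece` and PyTorch are installed. The helper names `batches` and `translate`, the `batch_size` default, and the `model_id` parameter are illustrative assumptions; the card does not state the Hub repository id, so it has to be supplied by the caller.

```python
from typing import Iterator, List


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate(sentences: List[str], model_id: str, batch_size: int = 8) -> List[str]:
    """Translate English sentences to German with a converted Marian checkpoint.

    `model_id` is the Hub repository id (or a local directory) of the
    checkpoint. It is a required argument because this card does not name it.
    """
    # Imported here so the helpers above stay usable without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    outputs: List[str] = []
    for chunk in batches(sentences, batch_size):
        # Pad to the longest sentence in the chunk and truncate overlong input.
        encoded = tokenizer(chunk, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**encoded)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

Calling `translate(["How are you today?"], model_id=...)` with the repository id of this checkpoint downloads the weights on first use and returns a list of German strings. Batching is only there to bound memory use on longer inputs; it is a convenience choice, not something the original release prescribes.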