T5 English, Russian and Chinese multilingual machine translation
This model is a conventional T5 transformer trained in multitask mode, fine-tuned for machine translation between the pairs ru-zh, zh-ru, en-zh, zh-en, en-ru and ru-en.
The model can translate directly between any pair of Russian, Chinese and English. The target language is selected with a prefix of the form 'translate to <lang>: ' (for example, 'translate to zh: '). The source language does not need to be specified, and the source text may even mix several languages.
Example: translate Russian to Chinese
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = 'utrobinmv/t5_translate_en_ru_zh_small_1024'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
prefix = 'translate to zh: '
src_text = prefix + "Цель разработки — предоставить пользователям личного синхронного переводчика."
# translate Russian to Chinese
input_ids = tokenizer(src_text, return_tensors="pt")
generated_tokens = model.generate(**input_ids)
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result)
#开发的目的是为用户提供个人同步翻译。
Example: translate Chinese to Russian
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = 'utrobinmv/t5_translate_en_ru_zh_small_1024'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
prefix = 'translate to ru: '
src_text = prefix + "开发的目的是为用户提供个人同步翻译。"
# translate Chinese to Russian
input_ids = tokenizer(src_text, return_tensors="pt")
generated_tokens = model.generate(**input_ids)
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result)
#Цель разработки - предоставить пользователям персональный синхронный перевод.
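The same prefix scheme covers the English pairs as well. Below is a minimal sketch for English to Russian; the English source sentence is illustrative and not taken from the original examples.
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = 'utrobinmv/t5_translate_en_ru_zh_small_1024'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
prefix = 'translate to ru: '
src_text = prefix + "The goal of the development is to provide users with a personal simultaneous interpreter."
# translate English to Russian
input_ids = tokenizer(src_text, return_tensors="pt")
generated_tokens = model.generate(**input_ids)
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result)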
Languages covered
Russian (ru_RU), Chinese (zh_CN), English (en_US)
Evaluation results
- BLEU on the ntrex_en-ru test set (NTREX dataset): 28.576
- chrF on the ntrex_en-ru test set (NTREX dataset): 54.280
- TER on the ntrex_en-ru test set (NTREX dataset): 62.495
- METEOR on the ntrex_en-ru test set (NTREX dataset): 0.517
- ROUGE-1 on the ntrex_en-ru test set (NTREX dataset): 0.191
- ROUGE-2 on the ntrex_en-ru test set (NTREX dataset): 0.066
- ROUGE-L on the ntrex_en-ru test set (NTREX dataset): 0.190
- ROUGE-LSUM on the ntrex_en-ru test set (NTREX dataset): 0.189
- BERTScore F1 on the ntrex_en-ru test set (NTREX dataset): 0.855
- BERTScore precision on the ntrex_en-ru test set (NTREX dataset): 0.858
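The card does not state which tooling produced these scores. Purely as an illustration, corpus-level BLEU and chrF for a set of model outputs can be computed with the Hugging Face evaluate library (using the sacrebleu and chrf metric wrappers is an assumption, not documented here); the prediction and reference below are placeholders.
import evaluate

# Placeholder hypothesis/reference pair; in practice these would be the
# model outputs and reference translations for the NTREX en-ru test set.
predictions = ["Цель разработки - предоставить пользователям персональный синхронный перевод."]
references = [["Цель разработки - предоставить пользователям личного синхронного переводчика."]]

bleu = evaluate.load("sacrebleu")   # corpus-level BLEU
chrf = evaluate.load("chrf")        # character n-gram F-score

print(bleu.compute(predictions=predictions, references=references)["score"])
print(chrf.compute(predictions=predictions, references=references)["score"])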