utrobinmv committed on
Commit
6956362
1 Parent(s): 2ccaa0d

feat add readme

  1. README.md +75 -0
README.md ADDED
@@ -0,0 +1,75 @@
+ ---
+ language:
+ - ru
+ - zh
+ - en
+ tags:
+ - translation
+ license: apache-2.0
+ datasets:
+ - ccmatrix
+ metrics:
+ - sacrebleu
+ ---
+
+ # T5 English, Russian and Chinese multilingual machine translation
+
+ This model is a conventional T5 transformer trained in multitask mode and tuned specifically for machine translation between the pairs ru-zh, zh-ru, en-zh, zh-en, en-ru and ru-en.
+
+ The model can translate directly between any pair of Russian, Chinese and English. To select the target language, prepend the prefix 'translate to <lang>:' to the source text. The source language does not need to be specified, and the source text may be multilingual.
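The prefix convention described above can be sketched as a small helper. This is a minimal illustration, not part of the model's published API: the names `SUPPORTED_TARGETS` and `build_prompt` are hypothetical, and the check against the three language codes is an assumption based on the languages this README lists.

```python
# Hypothetical helper for illustration; not shipped with the model.
SUPPORTED_TARGETS = ("ru", "zh", "en")  # language codes named in this README


def build_prompt(text: str, target_lang: str) -> str:
    """Prepend the 'translate to <lang>: ' prefix that selects the target language."""
    if target_lang not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    # The source language is not encoded anywhere; only the target is named.
    return f"translate to {target_lang}: {text}"


prompt = build_prompt("Съешь ещё этих мягких французских булок.", "zh")
print(prompt)
```

The resulting string is what would be passed to the tokenizer in place of a hand-built `prefix + text`.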
+
+ Example: translating Russian to Chinese
+
+ ```python
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+ model_name = 'utrobinmv/t5_translate_en_ru_zh_large_1024'
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
+
+ prefix = 'translate to zh: '
+ src_text = prefix + "Съешь ещё этих мягких французских булок."
+
+ # translate Russian to Chinese
+ input_ids = tokenizer(src_text, return_tensors="pt")
+
+ generated_tokens = model.generate(**input_ids)
+
+ result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+ print(result)
+ # 再吃这些法国的甜蜜的面包。
+ ```
+
+ Example: translating Chinese to Russian
+
+ ```python
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+ model_name = 'utrobinmv/t5_translate_en_ru_zh_large_1024'
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
+
+ prefix = 'translate to ru: '
+ src_text = prefix + "再吃这些法国的甜蜜的面包。"
+
+ # translate Chinese to Russian
+ input_ids = tokenizer(src_text, return_tensors="pt")
+
+ generated_tokens = model.generate(**input_ids)
+
+ result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
+ print(result)
+ # Съешьте этот сладкий хлеб из Франции.
+ ```
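Both examples translate a single sentence. Translating several texts in one call might look like the sketch below. It assumes `model` and `tokenizer` are loaded as in the examples above; the function name `translate_batch` and the default `max_new_tokens=128` are illustrative choices, not values taken from the model card.

```python
from typing import Any, List


def translate_batch(texts: List[str], target_lang: str, model: Any, tokenizer: Any,
                    max_new_tokens: int = 128) -> List[str]:
    """Translate several texts into one target language with a single generate() call."""
    # Every text gets the same target-language prefix.
    prompts = [f"translate to {target_lang}: {t}" for t in texts]
    # padding=True pads the prompts to a common length so they form one batch.
    batch = tokenizer(prompts, return_tensors="pt", padding=True)
    # max_new_tokens is an assumed cap on output length; adjust for longer inputs.
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With the objects from the examples above, `translate_batch(["Hello", "Привет"], "zh", model, tokenizer)` would return one translation per input, in order. Mixing source languages in one batch relies on the statement that the source language need not be specified.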
+
+ ## Languages covered
+
+ Russian (ru_RU), Chinese (zh_CN), English (en_US)