Akseltinfat committed on
Commit
0585988
1 Parent(s): cb2f944

Create README.md


---
tags:
- translation
- torch==1.8.0
widget:
- text: "Inference Unavailable"
---
### marianmt-zgh_en
* source languages: zgh
* target languages: en
* dataset:
* model: transformer-align
* pre-processing: normalization + SentencePiece
* test set scores: syllable: 15.95, word: 8.43

Files changed (1)
  1. README.md +50 -0
README.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ tags:
+ - translation
+ - torch==1.8.0
+ widget:
+ - text: "Inference Unavailable"
+ ---
+ ### marianmt-zh_cn-th
+ * source languages: zh_cn
+ * target languages: th
+ * dataset:
+ * model: transformer-align
+ * pre-processing: normalization + SentencePiece
+ * test set scores: syllable: 15.95, word: 8.43
+
+ ## Training
+
+ Training scripts from [LalitaDeelert/NLP-ZH_TH-Project](https://github.com/LalitaDeelert/NLP-ZH_TH-Project). Experiments tracked at [cstorm125/marianmt-zh_cn-th](https://wandb.ai/cstorm125/marianmt-zh_cn-th).
+
+ ```
+ export WANDB_PROJECT=marianmt-zh_cn-th
+ python train_model.py --input_fname ../data/v1/Train.csv \
+     --output_dir ../models/marianmt-zh_cn-th \
+     --source_lang zh --target_lang th \
+     --metric_tokenize th_syllable --fp16
+ ```
+
+ ## Usage
+
+ ```
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ # Load the tokenizer and model from the Hub; inference runs on CPU.
+ tokenizer = AutoTokenizer.from_pretrained("Lalita/marianmt-zh_cn-th")
+ model = AutoModelForSeq2SeqLM.from_pretrained("Lalita/marianmt-zh_cn-th").cpu()
+
+ src_text = [
+     '我爱你',
+     '我想吃米饭',
+ ]
+ # Pad the batch so both sentences fit in one tensor, then generate and decode.
+ translated = model.generate(**tokenizer(src_text, return_tensors="pt", padding=True))
+ print([tokenizer.decode(t, skip_special_tokens=True) for t in translated])
+
+ # > ['ผมรักคุณนะ', 'ฉันอยากกินข้าว']
+ ```
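The Usage snippet translates two sentences in a single call. For longer inputs, one possible approach is to feed the model fixed-size batches. The sketch below is not part of the committed README: the helper name `translate_in_batches` and the default `batch_size=8` are assumptions chosen for illustration, and it relies only on the tokenizer call, `model.generate`, and `tokenizer.decode` already used in the snippet.

```python
def translate_in_batches(texts, tokenizer, model, batch_size=8):
    """Translate `texts` in fixed-size batches, preserving input order.

    `tokenizer` and `model` are expected to behave like the objects created
    in the Usage snippet. The helper name and the default batch size are
    illustrative assumptions, not part of the original README.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Pad within the batch so all sentences share one tensor.
        inputs = tokenizer(batch, return_tensors="pt", padding=True)
        generated = model.generate(**inputs)
        # Decode each generated sequence back to plain text.
        results.extend(
            tokenizer.decode(t, skip_special_tokens=True) for t in generated
        )
    return results
```

With the `tokenizer` and `model` from the Usage snippet, `translate_in_batches(src_text, tokenizer, model, batch_size=2)` should produce the same list as the single `generate` call; smaller batches mainly trade speed for lower peak memory.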
+
+ ## Requirements
+ ```
+ transformers==4.6.0
+ torch==1.8.0
+ ```
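Because the requirements pin exact versions, it can help to confirm the local environment matches before loading the model. The following is a minimal sketch that is not part of the committed README: the function name `find_mismatches` and the choice to ignore local build suffixes (such as `+cu111` on some torch wheels) are assumptions. It uses only the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata

# Pins copied from the Requirements block above.
PINS = {"transformers": "4.6.0", "torch": "1.8.0"}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for each unmet pin.

    `get_version` is injectable so the check can be exercised without the
    packages installed; by default it queries the current environment.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Compare only the public version, dropping any "+local" suffix.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Calling `find_mismatches(PINS)` returns an empty dict when both pinned versions are installed; otherwise each entry shows the expected version next to what was found (`None` if the package is missing).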