Akseltinfat committed on
Commit 0376b91
1 Parent(s): 0585988

Update README.md

Files changed (1)
  1. README.md +1 -35
README.md CHANGED
@@ -13,38 +13,4 @@ widget:
  * pre-processing: normalization + SentencePiece
  * test set scores: syllable: 15.95, word: 8.43
 
- ## Training
-
- Training scripts from [LalitaDeelert/NLP-ZH_TH-Project](https://github.com/LalitaDeelert/NLP-ZH_TH-Project). Experiments tracked at [cstorm125/marianmt-zh_cn-th](https://wandb.ai/cstorm125/marianmt-zh_cn-th).
-
- ```
- export WANDB_PROJECT=marianmt-zh_cn-th
- python train_model.py --input_fname ../data/v1/Train.csv \
-     --output_dir ../models/marianmt-zh_cn-th \
-     --source_lang zh --target_lang th \
-     --metric_tokenize th_syllable --fp16
- ```
-
- ## Usage
-
- ```
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
-
- tokenizer = AutoTokenizer.from_pretrained("Lalita/marianmt-zh_cn-th")
- model = AutoModelForSeq2SeqLM.from_pretrained("Lalita/marianmt-zh_cn-th").cpu()
-
- src_text = [
-     '我爱你',
-     '我想吃米饭',
- ]
- translated = model.generate(**tokenizer(src_text, return_tensors="pt", padding=True))
- print([tokenizer.decode(t, skip_special_tokens=True) for t in translated])
-
- > ['ผมรักคุณนะ', 'ฉันอยากกินข้าว']
- ```
-
- ## Requirements
- ```
- transformers==4.6.0
- torch==1.8.0
- ```
+ #