kalebu committed
Commit 8930886
1 Parent(s): d0a815c

updated README.md

Files changed (1)
README.md: +30 -30
README.md CHANGED
@@ -15,39 +15,39 @@

Easy-translate is a script for translating large text files on your machine using the [M2M100 models](https://arxiv.org/pdf/2010.11125.pdf) from Facebook/Meta AI.

-M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation.
-It was introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository.
-The model that can directly translate between the 9,900 directions of 100 languages.
+**M2M100** is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation, introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository.
+
+> The model can directly translate between the 9,900 directions of 100 languages.

-Easy-Translate is built on top of 🤗HuggingFace's
-[Transformers](https://huggingface.co/docs/transformers/index) and
-🤗HuggingFace's [Accelerate](https://huggingface.co/docs/accelerate/index) library. We support:
+Easy-Translate is built on top of 🤗HuggingFace's [Transformers](https://huggingface.co/docs/transformers/index) and [Accelerate](https://huggingface.co/docs/accelerate/index) libraries.
+
+We currently support:

* CPU / GPU / multi-GPU / TPU acceleration
* BF16 / FP16 / FP32 precision
* Automatic batch size finder: forget CUDA OOM errors. Set an initial batch size; if it doesn't fit, we will automatically adjust it.
-* Sharded Data Parallel to load huge models sharded on multiple GPUs (See: https://huggingface.co/docs/accelerate/fsdp).
+* Sharded Data Parallel to load huge models sharded across multiple GPUs (see <https://huggingface.co/docs/accelerate/fsdp>).

-Test the 🔌 Online Demo here: https://huggingface.co/spaces/Iker/Translate-100-languages
+> Test the 🔌 Online Demo here: <https://huggingface.co/spaces/Iker/Translate-100-languages>

## Supported languages
+
See the [Supported languages table](supported_languages.md) for a table of the supported languages and their ids.

**List of supported languages:**
Afrikaans, Amharic, Arabic, Asturian, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Breton, Bosnian, Catalan, Cebuano, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Persian, Fulah, Finnish, French, WesternFrisian, Irish, Gaelic, Galician, Gujarati, Hausa, Hebrew, Hindi, Croatian, Haitian, Hungarian, Armenian, Indonesian, Igbo, Iloko, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, CentralKhmer, Kannada, Korean, Luxembourgish, Ganda, Lingala, Lao, Lithuanian, Latvian, Malagasy, Macedonian, Malayalam, Mongolian, Marathi, Malay, Burmese, Nepali, Dutch, Norwegian, NorthernSotho, Occitan, Oriya, Panjabi, Polish, Pushto, Portuguese, Romanian, Russian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Albanian, Serbian, Swati, Sundanese, Swedish, Swahili, Tamil, Thai, Tagalog, Tswana, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Wolof, Xhosa, Yiddish, Yoruba, Chinese, Zulu

## Supported Models

-* **Facebook/m2m100_418M**: https://huggingface.co/facebook/m2m100_418M
+* **Facebook/m2m100_418M**: <https://huggingface.co/facebook/m2m100_418M>

-* **Facebook/m2m100_1.2B**: https://huggingface.co/facebook/m2m100_1.2B
+* **Facebook/m2m100_1.2B**: <https://huggingface.co/facebook/m2m100_1.2B>

-* **Facebook/m2m100_12B**: https://huggingface.co/facebook/m2m100-12B-avg-5-ckpt
+* **Facebook/m2m100_12B**: <https://huggingface.co/facebook/m2m100-12B-avg-5-ckpt>

-* Any other m2m100 model from HuggingFace's Hub: https://huggingface.co/models?search=m2m100
+* Any other m2m100 model from HuggingFace's Hub: <https://huggingface.co/models?search=m2m100>

-## Requirements:
+## Requirements

```
Pytorch >= 1.10.0
@@ -62,9 +62,10 @@ pip install --upgrade transformers

## Translate a file

Run `python translate.py -h` for more info.

-#### Using a single CPU / GPU:
+#### Using a single CPU / GPU
+
```bash
accelerate launch translate.py \
--sentences_path sample_text/en.txt \
@@ -74,10 +75,11 @@ accelerate launch translate.py \
--model_name facebook/m2m100_1.2B
```

-#### Multi-GPU:
-See Accelerate documentation for more information (multi-node, TPU, Sharded model...): https://huggingface.co/docs/accelerate/index
-You can use the Accelerate CLI to configure the Accelerate environment (Run
+#### Multi-GPU
+
+See the Accelerate documentation for more information (multi-node, TPU, sharded models, ...): <https://huggingface.co/docs/accelerate/index>
+You can use the Accelerate CLI to configure the Accelerate environment (run
`accelerate config` in your terminal) instead of using the
`--multi_gpu` and `--num_processes` flags.

```bash
@@ -89,15 +91,15 @@ accelerate launch --multi_gpu --num_processes 2 --num_machines 1 translate.py \
--model_name facebook/m2m100_1.2B
```

-#### Automatic batch size finder:
+#### Automatic batch size finder
+
We will automatically find a batch size that fits in your GPU memory.
The default initial batch size is 128 (you can set it with the `--starting_batch_size 128` flag).
If we find an Out Of Memory error, we will automatically decrease the batch size until we find a working one.

-
-#### Choose precision:
-Use the `--precision` flag to choose the precision of the model. You can choose between: bf16, fp16 and 32.
+#### Choose precision
+
+Use the `--precision` flag to choose the precision of the model. You can choose between bf16, fp16, and 32.

```bash
accelerate launch translate.py \
@@ -112,5 +114,3 @@ accelerate launch translate.py \

## Evaluate translations

Work in progress...
-
-
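For readers who want to try the supported checkpoints directly, the snippet below follows the standard 🤗 Transformers usage shown on the M2M100 model cards; it is not code from this repository, and the example sentence and language ids are only placeholders.

```python
# Translate one sentence with an M2M100 checkpoint using 🤗 Transformers directly.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"  # source language id (see the supported-languages table)
inputs = tokenizer("Translation is fun.", return_tensors="pt")

# Force the decoder to start with the target-language token (Spanish here).
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("es"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```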
 
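The automatic batch size finder described in the README follows the usual retry-on-OOM pattern. A minimal sketch of that mechanism is shown below; the function and variable names are illustrative, not the script's actual implementation.

```python
import torch

def translate_with_fallback(batch_translate, sentences, starting_batch_size=128):
    """Halve the batch size on CUDA OOM until generation fits (illustrative sketch)."""
    batch_size = starting_batch_size
    while batch_size >= 1:
        try:
            # Restart the whole pass with the current batch size.
            return [
                out
                for start in range(0, len(sentences), batch_size)
                for out in batch_translate(sentences[start:start + batch_size])
            ]
        except RuntimeError as err:  # CUDA OOM surfaces as a RuntimeError
            if "out of memory" not in str(err).lower():
                raise
            torch.cuda.empty_cache()  # release the partially allocated memory
            batch_size //= 2          # retry with a smaller batch
    raise RuntimeError("Even batch_size=1 does not fit in GPU memory")
```

🤗 Accelerate ships a similar helper, `find_executable_batch_size`, which wraps this retry loop as a decorator.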
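The `--precision` values (bf16, fp16, 32) map naturally onto Accelerate's mixed-precision modes. A sketch of how such a flag could be wired, assuming the Accelerator API and not the script's actual code:

```python
from accelerate import Accelerator

def build_accelerator(precision: str) -> Accelerator:
    """Map the --precision values onto Accelerate mixed-precision modes."""
    mixed_precision = {"bf16": "bf16", "fp16": "fp16", "32": "no"}[precision]
    return Accelerator(mixed_precision=mixed_precision)

accelerator = build_accelerator("fp16")
print(accelerator.device)  # cpu, cuda:0, or the local device in a multi-GPU launch
```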