kalebu committed
Commit 8930886
1 Parent(s): d0a815c

updated README.md

Files changed (1)
README.md: +30 -30
README.md CHANGED
@@ -15,39 +15,39 @@

Easy-translate is a script for translating large text files on your machine using the [M2M100 models](https://arxiv.org/pdf/2010.11125.pdf) from Facebook/Meta AI.

-M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation.
-It was introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository.
-The model that can directly translate between the 9,900 directions of 100 languages.
+**M2M100** is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation, introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository.
+
+> The model can directly translate between the 9,900 directions of 100 languages.

-Easy-Translate is built on top of 🤗HuggingFace's
-[Transformers](https://huggingface.co/docs/transformers/index) and
-🤗HuggingFace's [Accelerate](https://huggingface.co/docs/accelerate/index) library. We support:
+Easy-Translate is built on top of 🤗HuggingFace's [Transformers](https://huggingface.co/docs/transformers/index) and [Accelerate](https://huggingface.co/docs/accelerate/index) libraries.
+
+We currently support:

* CPU / GPU / multi-GPU / TPU acceleration
* BF16 / FP16 / FP32 precision
* Automatic batch size finder: forget CUDA OOM errors. Set an initial batch size; if it doesn't fit, we will automatically adjust it.
-* Sharded Data Parallel to load huge models sharded on multiple GPUs (See: https://huggingface.co/docs/accelerate/fsdp).
+* Sharded Data Parallel to load huge models sharded across multiple GPUs (see <https://huggingface.co/docs/accelerate/fsdp>).

-Test the 🔌 Online Demo here: https://huggingface.co/spaces/Iker/Translate-100-languages
+> Test the 🔌 Online Demo here: <https://huggingface.co/spaces/Iker/Translate-100-languages>

## Supported languages
+
See the [Supported languages table](supported_languages.md) for a table of the supported languages and their ids.

**List of supported languages:**
Afrikaans, Amharic, Arabic, Asturian, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Breton, Bosnian, Catalan, Cebuano, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Persian, Fulah, Finnish, French, WesternFrisian, Irish, Gaelic, Galician, Gujarati, Hausa, Hebrew, Hindi, Croatian, Haitian, Hungarian, Armenian, Indonesian, Igbo, Iloko, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, CentralKhmer, Kannada, Korean, Luxembourgish, Ganda, Lingala, Lao, Lithuanian, Latvian, Malagasy, Macedonian, Malayalam, Mongolian, Marathi, Malay, Burmese, Nepali, Dutch, Norwegian, NorthernSotho, Occitan, Oriya, Panjabi, Polish, Pushto, Portuguese, Romanian, Russian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Albanian, Serbian, Swati, Sundanese, Swedish, Swahili, Tamil, Thai, Tagalog, Tswana, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Wolof, Xhosa, Yiddish, Yoruba, Chinese, Zulu

## Supported Models

-* **Facebook/m2m100_418M**: https://huggingface.co/facebook/m2m100_418M
+* **Facebook/m2m100_418M**: <https://huggingface.co/facebook/m2m100_418M>

-* **Facebook/m2m100_1.2B**: https://huggingface.co/facebook/m2m100_1.2B
+* **Facebook/m2m100_1.2B**: <https://huggingface.co/facebook/m2m100_1.2B>

-* **Facebook/m2m100_12B**: https://huggingface.co/facebook/m2m100-12B-avg-5-ckpt
+* **Facebook/m2m100_12B**: <https://huggingface.co/facebook/m2m100-12B-avg-5-ckpt>

-* Any other m2m100 model from HuggingFace's Hub: https://huggingface.co/models?search=m2m100
+* Any other m2m100 model from HuggingFace's Hub: <https://huggingface.co/models?search=m2m100>

-## Requirements:
+## Requirements

```
Pytorch >= 1.10.0
@@ -62,9 +62,10 @@ pip install --upgrade transformers

## Translate a file

Run `python translate.py -h` for more info.

-#### Using a single CPU / GPU:
+#### Using a single CPU / GPU
+
```bash
accelerate launch translate.py \
--sentences_path sample_text/en.txt \
@@ -74,10 +75,11 @@ accelerate launch translate.py \
--model_name facebook/m2m100_1.2B
```

-#### Multi-GPU:
-See Accelerate documentation for more information (multi-node, TPU, Sharded model...): https://huggingface.co/docs/accelerate/index
-You can use the Accelerate CLI to configure the Accelerate environment (Run
+#### Multi-GPU
+
+See the Accelerate documentation for more information (multi-node, TPU, sharded models, ...): <https://huggingface.co/docs/accelerate/index>
+You can use the Accelerate CLI to configure the Accelerate environment (run
`accelerate config` in your terminal) instead of using the
`--multi_gpu` and `--num_processes` flags.

```bash
@@ -89,15 +91,15 @@ accelerate launch --multi_gpu --num_processes 2 --num_machines 1 translate.py \
--model_name facebook/m2m100_1.2B
```

-#### Automatic batch size finder:
+#### Automatic batch size finder
+
We will automatically find a batch size that fits in your GPU memory.
The default initial batch size is 128 (you can set it with the `--starting_batch_size 128` flag).
If we find an Out Of Memory error, we will automatically decrease the batch size until we find a working one.

-
-#### Choose precision:
-Use the `--precision` flag to choose the precision of the model. You can choose between: bf16, fp16 and 32.
+#### Choose precision
+
+Use the `--precision` flag to choose the precision of the model. You can choose between bf16, fp16, and 32.

```bash
accelerate launch translate.py \
@@ -112,5 +114,3 @@ accelerate launch translate.py \

## Evaluate translations

Work in progress...
-
-
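For readers who want to try the supported checkpoints directly, the snippet below follows the standard 🤗 Transformers usage shown on the M2M100 model cards; it is not code from this repository, and the example sentence and language ids are only placeholders.

```python
# Translate one sentence with an M2M100 checkpoint using 🤗 Transformers directly.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"  # source language id (see the supported-languages table)
inputs = tokenizer("Translation is fun.", return_tensors="pt")

# Force the decoder to start with the target-language token (Spanish here).
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("es"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```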
 
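The automatic batch size finder described in the README follows the usual retry-on-OOM pattern. A minimal sketch of that mechanism is shown below; the function and variable names are illustrative, not the script's actual implementation.

```python
import torch

def translate_with_fallback(batch_translate, sentences, starting_batch_size=128):
    """Halve the batch size on CUDA OOM until generation fits (illustrative sketch)."""
    batch_size = starting_batch_size
    while batch_size >= 1:
        try:
            # Restart the whole pass with the current batch size.
            return [
                out
                for start in range(0, len(sentences), batch_size)
                for out in batch_translate(sentences[start:start + batch_size])
            ]
        except RuntimeError as err:  # CUDA OOM surfaces as a RuntimeError
            if "out of memory" not in str(err).lower():
                raise
            torch.cuda.empty_cache()  # release the partially allocated memory
            batch_size //= 2          # retry with a smaller batch
    raise RuntimeError("Even batch_size=1 does not fit in GPU memory")
```

🤗 Accelerate ships a similar helper, `find_executable_batch_size`, which wraps this retry loop as a decorator.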
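The `--precision` values (bf16, fp16, 32) map naturally onto Accelerate's mixed-precision modes. A sketch of how such a flag could be wired, assuming the Accelerator API and not the script's actual code:

```python
from accelerate import Accelerator

def build_accelerator(precision: str) -> Accelerator:
    """Map the --precision values onto Accelerate mixed-precision modes."""
    mixed_precision = {"bf16": "bf16", "fp16": "fp16", "32": "no"}[precision]
    return Accelerator(mixed_precision=mixed_precision)

accelerator = build_accelerator("fp16")
print(accelerator.device)  # cpu, cuda:0, or the local device in a multi-GPU launch
```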