Iker committed
Commit e2b2525 · 1 Parent(s): 55e5d07

Add examples folder

README.md CHANGED
@@ -44,8 +44,9 @@ See the [Supported languages table](supported_languages.md) for a table of the s

## Supported Models

- 💥 EasyTranslate now supports any Seq2SeqLM (m2m100, nllb200, MarianMT, T5, FlanT5, etc.) and any CausalLM (GPT2, LLaMA, Vicuna, Falcon) model from HuggingFace's Hub!!
+ 💥 EasyTranslate now supports any Seq2SeqLM (m2m100, nllb200, small100, mbart, MarianMT, T5, FlanT5, etc.) and any CausalLM (GPT2, LLaMA, Vicuna, Falcon) model from HuggingFace's Hub!!
We still recommend you to use M2M100 or NLLB200 for the best results, but you can experiment with other LLMs and prompting to generate translations. See [Prompting Section](#prompting) for more information.
+ You can also see [the examples folder](examples) for examples of how to use EasyTranslate with different models.

### M2M100
**M2M100** is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation introduced in this [paper](https://arxiv.org/abs/2010.11125) and first released in [this](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) repository.

@@ -72,6 +73,13 @@ We still recommend you to use M2M100 or NLLB200 for the best results, but you ca

- **facebook/nllb-200-distilled-600M**: <https://huggingface.co/facebook/nllb-200-distilled-600M>

+ ### Other MT Models supported
+ We support every MT model in the 🤗 Hugging Face Hub. If you find one that doesn't work, please open an issue for us to fix it or a PR with the fix. This includes, among many others:
+ - **Small100**: <https://huggingface.co/alirezamsh/small100>
+ - **Mbart many-to-many / many-to-one**: <https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt>
+ - **Opus MT**: <https://huggingface.co/Helsinki-NLP/opus-mt-es-en>
+
+

## Citation
If you use this software please cite

@@ -113,7 +121,8 @@ pip install peft

## Translate a file

- Run `python translate.py -h` for more info.
+ Run `python translate.py -h` for more info.
+ See [the examples folder](examples) for examples of how to run different models.

#### Using a single CPU / GPU

@@ -156,11 +165,13 @@ pip install bitsandbytes

python3 translate.py \
--sentences_path sample_text/en.txt \
- --output_path sample_text/en2es.translation.nllb-moe-54b.txt \
+ --output_path sample_text/en2es.translation.nllb200-moe-54B.txt \
--source_lang eng_Latn \
--target_lang spa_Latn \
--model_name facebook/nllb-moe-54b \
- --precision 8
+ --precision 8 \
+ --force_auto_device_map \
+ --starting_batch_size 8
```

If even the quantized model does not fit in your GPU memory, you can set the `--force_auto_device_map` flag.
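For reference, `--precision 8` together with `--force_auto_device_map` corresponds roughly to 8-bit loading via bitsandbytes plus an automatic device map in 🤗 Transformers/Accelerate. A minimal sketch of that kind of loading (illustrative only, not the repository's actual code; the checkpoint name is the one from the command above):

```python
# Illustrative sketch of 8-bit loading with an automatic device map (roughly what
# --precision 8 plus --force_auto_device_map implies). Not Easy-Translate's actual code.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-moe-54b",
    load_in_8bit=True,   # 8-bit quantization via bitsandbytes
    device_map="auto",   # place layers on the GPU(s) and offload what doesn't fit to CPU
)
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-moe-54b")
```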
examples/Alpaca-Lora.sh ADDED
@@ -0,0 +1,14 @@
+ # Run the Alpaca-LoRA model (a LoRA adapter on top of LLaMA) on sample text using prompting
+ # We need to set the base model with --model_name and the LoRA weights with --lora_weights_name_or_path
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.AlpacaLora.translation.txt \
+ --model_name decapoda-research/llama-7b-hf \
+ --lora_weights_name_or_path tloen/alpaca-lora-7b \
+ --prompt "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nTranslate this text from English into Spanish\n\n### Input:\n%%SENTENCE%%\n\n### Response:\n" \
+ --precision 8 \
+ --force_auto_device_map \
+ --starting_batch_size 8
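Loading a base model together with LoRA weights is roughly what `--lora_weights_name_or_path` implies. A minimal sketch with 🤗 PEFT (illustrative only, not Easy-Translate's actual code; model names are the ones from the script above):

```python
# Minimal sketch: attach a LoRA adapter to a base causal LM with 🤗 PEFT.
# Illustrative only; not Easy-Translate's actual implementation.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # base weights (--model_name)
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "tloen/alpaca-lora-7b")  # LoRA weights (--lora_weights_name_or_path)
model.eval()  # ready for generation
```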
examples/FlanT5-large.sh ADDED
@@ -0,0 +1,10 @@
+ # Run the Flan-T5 model on sample text using prompting
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.FlanT5.translation.txt \
+ --model_name google/flan-t5-large \
+ --prompt "Translate English to Spanish: %%SENTENCE%%" \
+ --precision bf16
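The `--prompt` flag wraps each input line in a template, with `%%SENTENCE%%` standing in for the sentence to translate. A minimal sketch of that substitution (illustrative only, not the repository's actual code):

```python
# Illustrative sketch of how a %%SENTENCE%% prompt template is filled before generation.
prompt_template = "Translate English to Spanish: %%SENTENCE%%"

def build_prompt(sentence: str) -> str:
    # Replace the placeholder with a sentence read from --sentences_path
    return prompt_template.replace("%%SENTENCE%%", sentence)

print(build_prompt("Hello, how are you?"))
# Translate English to Spanish: Hello, how are you?
```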
examples/LLaMA.sh ADDED
@@ -0,0 +1,12 @@
+ # Run the LLaMA-65B model on sample text using prompting
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.LLaMA.translation.txt \
+ --model_name PATH_TO_LOCAL_LLAMA_WEIGHTS_IN_HF_FORMAT \
+ --prompt "Translate English to Spanish: %%SENTENCE%%" \
+ --precision 8 \
+ --force_auto_device_map \
+ --starting_batch_size 8
examples/README.md ADDED
@@ -0,0 +1,36 @@
+ # Easy-Translate Examples
+
+ This folder contains examples of how to use Easy-Translate with different models and configurations.
+ You can adapt these examples to your own use cases. If you have any questions, please feel free to open an issue.
+
+ ### MT Models
+
+ ```bash
+ m2m100-1.2B.sh
+ m2m100-12B_fp16.sh
+ nllb200_3B_fp16.sh
+ opusMT.sh
+ mbart.sh
+ small100.sh
+ ```
+
+ #### Multi-GPU example
+ ```bash
+ m2m100-1.2B_2GPUS.sh
+ ```
+ #### Running large models on consumer hardware
+ ```bash
+ m2m100-12B_8bit.sh
+ nllb200_3B_8bit.sh
+ nllb200-moe-54B_1GPU_8bits.sh
+ nllb200-moe-54B_1GPU_4bits.sh
+ ```
+
+ ### Running LLMs with translation prompts
+
+ ```bash
+ FlanT5-large.sh
+ LLaMA.sh
+ Vicuna.sh
+ Alpaca-Lora.sh
+ ```
examples/Vicuna.sh ADDED
@@ -0,0 +1,16 @@
+ # Run the Vicuna v1.3 model on sample text using prompting
+ # Different model sizes available, see https://github.com/lm-sys/FastChat#vicuna-weights:
+ # lmsys/vicuna-33b-v1.3
+ # lmsys/vicuna-13b-v1.3
+ # lmsys/vicuna-7b-v1.3
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.Vicuna33B.translation.txt \
+ --model_name lmsys/vicuna-33b-v1.3 \
+ --prompt "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: %%SENTENCE%% ASSISTANT:" \
+ --precision 8 \
+ --force_auto_device_map \
+ --starting_batch_size 8
examples/m2m100-1.2B.sh ADDED
@@ -0,0 +1,10 @@
+ # Run M2M100-1.2B model on sample text. One GPU, default precision.
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.m2m100_1.2B.txt \
+ --source_lang en \
+ --target_lang es \
+ --model_name facebook/m2m100_1.2B
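For M2M100-style models, the `--source_lang` / `--target_lang` codes map to the tokenizer's language settings and the forced target-language token. A minimal sketch in plain 🤗 Transformers of what those flags roughly correspond to (illustrative only, not Easy-Translate's actual code):

```python
# Illustrative sketch: what --source_lang en / --target_lang es roughly mean for M2M100.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_1.2B")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/m2m100_1.2B")

tokenizer.src_lang = "en"  # --source_lang
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("es"),  # --target_lang
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```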
examples/m2m100-1.2B_2GPUS.sh ADDED
@@ -0,0 +1,10 @@
+ # Run M2M100-1.2B model on sample text. Multi GPU, default precision.
+
+ cd ..
+
+ accelerate launch --multi_gpu --num_processes 2 --num_machines 1 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.m2m100_1.2B.txt \
+ --source_lang en \
+ --target_lang es \
+ --model_name facebook/m2m100_1.2B
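`accelerate launch --num_processes 2` starts one translation process per GPU, and each process handles its own slice of the input file. A rough sketch of that kind of sharding with 🤗 Accelerate (an illustration of the general idea only, not the repository's actual implementation):

```python
# Rough sketch: each process started by `accelerate launch` works on a stride of the sentences.
from accelerate import Accelerator

accelerator = Accelerator()
sentences = open("sample_text/en.txt", encoding="utf8").read().splitlines()

# Process i handles sentences i, i + N, i + 2N, ... where N is the number of processes/GPUs.
shard = sentences[accelerator.process_index :: accelerator.num_processes]
print(f"Process {accelerator.process_index} translates {len(shard)} sentences")
```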
examples/m2m100-12B_8bit.sh ADDED
@@ -0,0 +1,15 @@
+ # Run the M2M100-12B model on sample text. This model requires a GPU with a lot of VRAM, so we use
+ # 8-bit quantization to reduce the required VRAM so that it fits on consumer-grade GPUs. If you have a GPU
+ # with a lot of VRAM, running the model in FP16 should be faster and produce slightly better results,
+ # see examples/m2m100-12B_fp16.sh
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.m2m100_12B.txt \
+ --source_lang en \
+ --target_lang es \
+ --model_name facebook/m2m100-12B-avg-5-ckpt \
+ --precision 8 \
+ --starting_batch_size 8
examples/m2m100-12B_fp16.sh ADDED
@@ -0,0 +1,12 @@
+ # Run the M2M100-12B model on sample text. We use FP16 precision, which requires a GPU with a lot of VRAM (e.g. an NVIDIA A100).
+ # For running this model on consumer-grade GPUs, use 8-bit quantization, see examples/m2m100-12B_8bit.sh
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.m2m100_12B.txt \
+ --source_lang en \
+ --target_lang es \
+ --model_name facebook/m2m100-12B-avg-5-ckpt \
+ --precision fp16
examples/mbart.sh ADDED
@@ -0,0 +1,10 @@
+ # Run Mbart many-to-many model on sample text.
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.mbart.txt \
+ --source_lang en_XX \
+ --target_lang es_XX \
+ --model_name facebook/mbart-large-50-many-to-many-mmt
examples/nllb200-moe-54B_1GPU_4bits.sh ADDED
@@ -0,0 +1,16 @@
+ # Run NLLB200-MOE model on sample text. This is a huge model that doesn't fit on a single GPU, so we use
+ # 4-bit quantization to reduce the required VRAM. Even so, it might not fit on a single GPU, so we also use
+ # the --force_auto_device_map flag, which offloads the model parameters that don't fit on the GPU to the CPU.
+
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.nllb200-moe-54B.txt \
+ --source_lang eng_Latn \
+ --target_lang spa_Latn \
+ --model_name facebook/nllb-moe-54b \
+ --precision 4 \
+ --force_auto_device_map \
+ --starting_batch_size 8
examples/nllb200-moe-54B_1GPU_8bits.sh ADDED
@@ -0,0 +1,16 @@
+ # Run NLLB200-MOE model on sample text. This is a huge model that doesn't fit on a single GPU, so we use
+ # 8-bit quantization to reduce the required VRAM. Even so, it might not fit on a single GPU, so we also use
+ # the --force_auto_device_map flag, which offloads the model parameters that don't fit on the GPU to the CPU.
+ # If 8-bit quantization is not enough, you can use 4-bit quantization, see examples/nllb200-moe-54B_1GPU_4bits.sh
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.nllb200-moe-54B.txt \
+ --source_lang eng_Latn \
+ --target_lang spa_Latn \
+ --model_name facebook/nllb-moe-54b \
+ --precision 8 \
+ --force_auto_device_map \
+ --starting_batch_size 8
examples/nllb200_3B_8bit.sh ADDED
@@ -0,0 +1,12 @@
+ # Run NLLB200-3B model on sample text using 8-bit quantization, so it fits on GPUs with less VRAM.
+ # If your GPU has enough VRAM, FP16 is faster and produces slightly better results, see examples/nllb200_3B_fp16.sh
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.nllb-200_3B.txt \
+ --source_lang eng_Latn \
+ --target_lang spa_Latn \
+ --model_name facebook/nllb-200-3.3B \
+ --precision 8
examples/nllb200_3B_fp16.sh ADDED
@@ -0,0 +1,12 @@
+ # Run NLLB200-3B model on sample text. We use FP16 precision, which requires a GPU with a lot of VRAM.
+ # For running this model on GPUs with less VRAM, use 8-bit quantization, see examples/nllb200_3B_8bit.sh
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.nllb-200_3B.txt \
+ --source_lang eng_Latn \
+ --target_lang spa_Latn \
+ --model_name facebook/nllb-200-3.3B \
+ --precision fp16
examples/opusMT.sh ADDED
@@ -0,0 +1,8 @@
+ # Run an Opus MT model on sample text. Opus MT models are trained for a single language pair (here English to Spanish), so no --source_lang/--target_lang flags are needed.
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.opus.txt \
+ --model_name Helsinki-NLP/opus-mt-en-es
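Because each Opus MT checkpoint already encodes its language pair, translating with one in plain 🤗 Transformers needs no language codes at all. A minimal sketch (illustrative only; it assumes the English-to-Spanish checkpoint Helsinki-NLP/opus-mt-en-es):

```python
# Minimal sketch: translating with a pair-specific Opus MT (MarianMT) model in plain Transformers.
# Illustrative only; assumes the English-to-Spanish checkpoint Helsinki-NLP/opus-mt-en-es.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-es"
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

batch = tokenizer(["Hello, how are you?"], return_tensors="pt", padding=True)
outputs = model.generate(**batch)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```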
examples/small100.sh ADDED
@@ -0,0 +1,10 @@
+ # Run SMALL100 model on sample text.
+
+ cd ..
+
+ python3 translate.py \
+ --sentences_path sample_text/en.txt \
+ --output_path sample_text/en2es.translation.small100.txt \
+ --source_lang en \
+ --target_lang es \
+ --model_name alirezamsh/small100