haoranxu committed
Commit 0226dfe
1 Parent(s): d1afbc9

Update README.md

Files changed (1)
  1. README.md +12 -2
README.md CHANGED
@@ -1,8 +1,18 @@
 ---
 license: mit
 ---
-**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures superior translation accuracy and performance.
-
+**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model that adopts a new translation paradigm: it is first fine-tuned on monolingual data and then further optimized on high-quality parallel data. This two-step fine-tuning process yields strong translation performance.
+Please find more details in our [paper](https://arxiv.org/abs/2309.11674).
+```
+@misc{xu2023paradigm,
+      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
+      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
+      year={2023},
+      eprint={2309.11674},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
 We release four translation models presented in the paper:
 - **ALMA-7B**: full-weight fine-tuning of LLaMA-2-7B on 20B monolingual tokens, followed by **full-weight** fine-tuning on human-written parallel data
 - **ALMA-7B-LoRA**: full-weight fine-tuning of LLaMA-2-7B on 20B monolingual tokens, followed by **LoRA** fine-tuning on human-written parallel data
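For readers arriving at this model card, a minimal usage sketch follows. It assumes the released checkpoint is published on the Hub under an id like `haoranxu/ALMA-7B` and that the model is prompted with a plain instruction-style translation template; both the repo id and the prompt format are illustrative assumptions, not statements made in this README.

```python
# Minimal sketch: translating one sentence with an ALMA checkpoint via
# Hugging Face transformers. The repo id and the prompt template are
# assumptions for illustration, not taken from this README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "haoranxu/ALMA-7B"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Plain instruction-style translation prompt (assumed template).
prompt = "Translate this from German to English:\nGerman: Guten Morgen!\nEnglish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, num_beams=5, do_sample=False)

# Decode only the generated continuation, not the prompt itself.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Beam search (`num_beams=5`) mirrors common machine-translation decoding practice; greedy decoding also works if memory is tight.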