New Translation model released.

C3TR-Adapter is the QLoRA adapter for google/gemma-7b.
Despite 4-bit quantization, its GPU memory requirement has increased to 8.1 GB. However, it can still run on the free version of Colab, and performance is much improved!

webbigdata/ALMA-7B-Ja-V2-GPTQ-Ja-En

ALMA-7B-Ja-V2 is a machine translation model that uses ALMA's learning method to translate from Japanese to English and from English to Japanese.

ALMA-7B-Ja-V2-GPTQ-Ja-En is a quantized version: it is smaller, runs faster, and is easier to use, at the cost of some performance.

Sample code

If you have a Google account, you can run the model for free using the following Colab.

Click the "Open In Colab" button on the linked page to start Colab.

Free Colab Sample

If you want to translate an entire text file at once, try the Colab below.
ALMA_7B_Ja_V2_GPTQ_Ja_En_batch_translation_sample
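
As a rough illustration of what translating a whole file involves, the sketch below wraps each line in a translation prompt and passes it to a translation function. The prompt template follows the style used by ALMA models and is an assumption here, as are the names `build_prompt`, `translate_lines`, and `translate_fn` (a hypothetical stand-in for the model call); check the Colab notebook for the exact prompt and generation code.

```python
from typing import Callable, Iterable, List

# Assumed ALMA-style prompt template; verify the exact wording in the Colab notebook.
PROMPT_TEMPLATE = "Translate this from {src} to {tgt}:\n{src}: {text}\n{tgt}:"


def build_prompt(text: str, src: str = "Japanese", tgt: str = "English") -> str:
    """Wrap one source line in the (assumed) translation prompt."""
    return PROMPT_TEMPLATE.format(src=src, tgt=tgt, text=text.strip())


def translate_lines(
    lines: Iterable[str],
    translate_fn: Callable[[str], str],
    src: str = "Japanese",
    tgt: str = "English",
) -> List[str]:
    """Translate a file line by line, keeping blank lines in place.

    translate_fn is a hypothetical callable that takes a prompt and
    returns the model's translation as a string.
    """
    results: List[str] = []
    for line in lines:
        if not line.strip():
            results.append("")  # preserve paragraph breaks
            continue
        results.append(translate_fn(build_prompt(line, src, tgt)))
    return results
```

Translating one line at a time keeps each prompt short, which also helps avoid the memory-related error described in the next part of this card.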

If you encounter the error below:

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

This error means there is not enough memory. Decrease num_beams or the token size, or shorten the text to be translated.
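
The advice above can be sketched as a retry loop that steps down to cheaper settings, plus a helper that cuts long input into shorter pieces. This is a minimal sketch and not the notebook's code: `generate_fn` is a hypothetical wrapper around the model's generation call, "token size" is assumed to correspond to a `max_new_tokens` argument, and the values in `FALLBACK_SETTINGS` are illustrative only.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (num_beams, max_new_tokens) pairs from most to least demanding; illustrative values.
FALLBACK_SETTINGS: Sequence[Tuple[int, int]] = ((5, 512), (3, 256), (1, 128))


def split_text(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    chunks: List[str] = []
    current = ""
    for ch in text:
        current += ch
        at_sentence_end = ch in "。．.!?！？\n"
        # Cut at the hard limit, or at a sentence end once the chunk is half full.
        if len(current) >= max_chars or (at_sentence_end and len(current) >= max_chars // 2):
            chunks.append(current)
            current = ""
    if current:
        chunks.append(current)
    return chunks


def generate_with_fallback(
    prompt: str,
    generate_fn: Callable[..., str],
    settings: Sequence[Tuple[int, int]] = FALLBACK_SETTINGS,
) -> str:
    """Try each setting in turn, moving to a cheaper one after a RuntimeError."""
    last_error: Optional[RuntimeError] = None
    for num_beams, max_new_tokens in settings:
        try:
            return generate_fn(prompt, num_beams=num_beams, max_new_tokens=max_new_tokens)
        except RuntimeError as err:
            last_error = err  # remember the failure and try the next setting
    raise RuntimeError("all fallback settings failed; try shorter input") from last_error
```

If every setting fails, shortening the input with `split_text` and translating the chunks separately is the remaining option.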

Other versions

Original model: ALMA-7B-Ja-V2.

About this work

This work was done by webbigdata.

ALMA (Advanced Language Model-based trAnslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance. Please find more details in their paper.

@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models}, 
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Quantization settings: GPTQ 4bit/128G
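
To give a feel for what these settings imply for model size, here is a back-of-the-envelope estimate. It assumes that "128G" denotes a group size of 128, that each group stores one 16-bit scale and one 4-bit zero point, and that the model has a nominal 7 billion weights. It ignores layers that stay unquantized as well as activation and cache memory, so actual GPU usage will be higher.

```python
def gptq_weight_bytes(
    n_params: float,
    bits: int = 4,
    group_size: int = 128,
    scale_bits: int = 16,  # assumed: one fp16 scale per group
    zero_bits: int = 4,    # assumed: one 4-bit zero point per group
) -> float:
    """Rough storage for quantized weights: packed values plus per-group metadata."""
    bits_per_weight = bits + (scale_bits + zero_bits) / group_size
    return n_params * bits_per_weight / 8


# Nominal 7B parameter count; an approximation, not the exact model size.
approx = gptq_weight_bytes(7e9)
print(f"{approx / 1e9:.2f} GB")  # prints "3.64 GB"
```

Under these assumptions the per-group metadata adds only about 0.16 bits per weight on top of the 4-bit values.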