---
license: apache-2.0
---

# Overview:

Honyaku-7b-v2 is an improved version of its predecessor. It follows multilingual generation tags more accurately than the previous version, although for translation into Japanese the previous version may occasionally produce better results.

# Key Features & Limitations:

* Improved multilingual generation accuracy: the model follows multilingual generation tags more precisely than the previous version.
* Quality-reflective translation: the translation quality of Honyaku-7b is strongly shaped by the base model's pre-training, so quality varies with how much text the base language model saw in each language.
* The primary use case is translating passages of roughly 500 to several thousand tokens; performance drops on inputs that are much shorter or longer. Because of the base model's characteristics, translation into Japanese is the most stable (a sketch of the prompt format follows the cautions below).
* The model has been fine-tuned on sequences of up to 8k tokens, but given the base model's characteristics it supports up to 4k tokens including the prompt.

**Cautions:**

Translation does not work well for low-resource languages.

Translations produced by 7B-class large language models (LLMs) often contain errors.

Do not use unchecked output for formal communication.
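
To make the tag format concrete, here is a minimal sketch of the prompt that the examples below build. The variable names and sample text are illustrative; the tag format itself follows the code in the sections that follow:

```
# Wrap the source text in <english> tags, then open the target-language tag;
# the model completes the translation after the opening tag.
source_text = "Machine translation is improving quickly."  # illustrative sample
target_language = "japanese"                                # lowercase language name
prompt = f"<english>: {source_text} </english>\n\n<{target_language}>:"
print(prompt)

# The card recommends keeping prompt plus output within 4k tokens;
# a quick way to check the prompt side of that budget:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("aixsatoshi/Honyaku-7b-v2")
print(len(tokenizer(prompt).input_ids))
```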
# Honyaku-7b-webui
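The script below builds a Gradio chat UI that streams the translation token by token: generation runs on a background thread, `TextIteratorStreamer` yields partial output, and `StopOnTokens` halts generation at the end-of-sequence token.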
```
import gradio as gr
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
from threading import Thread

# Language list
languages = [
    "English", "Chinese (Simplified)", "Chinese (Traditional)", "Spanish", "Arabic", "Hindi",
    "Bengali", "Portuguese", "Russian", "Japanese", "German", "French", "Urdu", "Indonesian",
    "Italian", "Turkish", "Korean", "Vietnamese", "Tamil", "Marathi", "Telugu", "Persian",
    "Polish", "Dutch", "Thai", "Gujarati", "Romanian", "Ukrainian", "Malay", "Kannada", "Oriya (Odia)",
    "Burmese (Myanmar)", "Azerbaijani", "Uzbek", "Kurdish (Kurmanji)", "Swedish", "Filipino (Tagalog)",
    "Serbian", "Czech", "Hungarian", "Greek", "Belarusian", "Bulgarian", "Hebrew", "Finnish",
    "Slovak", "Norwegian", "Danish", "Sinhala", "Croatian", "Lithuanian", "Slovenian", "Latvian",
    "Estonian", "Armenian", "Malayalam", "Georgian", "Mongolian", "Afrikaans", "Nepali", "Pashto",
    "Punjabi", "Kurdish", "Kyrgyz", "Somali", "Albanian", "Icelandic", "Basque", "Luxembourgish",
    "Macedonian", "Maltese", "Hawaiian", "Yoruba", "Maori", "Zulu", "Welsh", "Swahili", "Haitian Creole",
    "Lao", "Amharic", "Khmer", "Javanese", "Kazakh", "Malagasy", "Sindhi", "Sundanese", "Tajik", "Xhosa",
    "Yiddish", "Bosnian", "Cebuano", "Chichewa", "Corsican", "Esperanto", "Frisian", "Galician", "Hausa",
    "Hmong", "Igbo", "Irish", "Kinyarwanda", "Latin", "Samoan", "Scots Gaelic", "Sesotho", "Shona",
    "Sotho", "Uyghur"
]

tokenizer = AutoTokenizer.from_pretrained("aixsatoshi/Honyaku-7b-v2")
model = AutoModelForCausalLM.from_pretrained("aixsatoshi/Honyaku-7b-v2", torch_dtype=torch.float16)
model = model.to('cuda:0')


class StopOnTokens(StoppingCriteria):
    # Stop generation as soon as the end-of-sequence token (id 2) is emitted.
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        stop_ids = [2]
        for stop_id in stop_ids:
            if input_ids[0][-1] == stop_id:
                return True
        return False


def predict(message, history, tokens, temperature, language):
    tag = "<" + language.lower() + ">"
    history_transformer_format = history + [[message, ""]]
    stop = StopOnTokens()

    # Wrap each source text in <english> tags and open the target-language tag.
    messages = "".join(["".join(["\n<english>:" + item[0] + "</english>\n", tag + item[1]])
                        for item in history_transformer_format])

    model_inputs = tokenizer([messages], return_tensors="pt").to("cuda")
    streamer = TextIteratorStreamer(tokenizer, timeout=10., skip_prompt=True, skip_special_tokens=True)
    generate_kwargs = dict(
        model_inputs,
        streamer=streamer,
        max_new_tokens=int(tokens),
        temperature=float(temperature),
        do_sample=True,
        top_p=0.95,
        top_k=20,
        repetition_penalty=1.15,
        num_beams=1,
        stopping_criteria=StoppingCriteriaList([stop])
    )
    # Generate on a background thread so tokens can be streamed as they arrive.
    t = Thread(target=model.generate, kwargs=generate_kwargs)
    t.start()

    partial_message = ""
    for new_token in streamer:
        if new_token != '<':  # drop a bare '<' that would begin the next tag
            partial_message += new_token
            yield partial_message


# Set up the Gradio interface
demo = gr.ChatInterface(
    fn=predict,
    title="Honyaku-7b webui",
    description="Translate using Honyaku-7b model",
    additional_inputs=[
        gr.Slider(100, 4096, value=1000, label="Tokens"),
        gr.Slider(0.0, 1.0, value=0.3, label="Temperature"),
        gr.Dropdown(choices=languages, value="Japanese", label="Language")
    ]
)

demo.queue().launch()
```
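A note on the stopping criterion: `stop_ids = [2]` assumes the tokenizer's end-of-sequence token has id 2, which holds for the Mistral-family tokenizer this model inherits. If you adapt the script to another model, check `tokenizer.eos_token_id` instead, as the non-streaming example below does.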
### Textstreamer
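For plain console streaming without a web UI, `TextStreamer` prints each token to standard output as it is generated: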
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "aixsatoshi/Honyaku-7b-v2"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Define the streamer
streamer = TextStreamer(tokenizer)

# Define the English prompt
english_prompt = """
Machine translation accuracy varies greatly across languages.
Key challenges include context understanding, idiomatic expressions, and syntactic differences.
Advanced models leverage AI to enhance translation quality, focusing on nuances and cultural relevance.

To address these challenges, developers employ neural networks and deep learning techniques, which adapt to linguistic variations and learn from vast amounts of text.
This approach helps in capturing the essence of languages and accurately translating complex sentences.

Furthermore, user feedback plays a crucial role in refining translation algorithms.
By analyzing corrections and suggestions, machine translation systems can evolve and handle nuanced expressions more effectively.
This iterative process ensures continuous improvement, making translations more reliable and understandable for a global audience.
"""

# Prepare the prompt for English-to-Japanese translation
prompt = f"<english>: {english_prompt} </english>\n\n<japanese>:"

# Tokenize the input text and move it to the CUDA device
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Generate the output; the streamer prints tokens as they are produced
output = model.generate(**inputs, max_new_tokens=4096, do_sample=True, top_k=20, top_p=0.95, streamer=streamer)
```
### Gradio non-streaming generation
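The same model can also be served through a plain `gr.Interface` that returns the finished translation in one piece rather than streaming it: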
```
import gradio as gr
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Language list
languages = [
    "English", "Chinese (Simplified)", "Chinese (Traditional)", "Spanish", "Arabic", "Hindi",
    "Bengali", "Portuguese", "Russian", "Japanese", "German", "French", "Urdu", "Indonesian",
    "Italian", "Turkish", "Korean", "Vietnamese", "Tamil", "Marathi", "Telugu", "Persian",
    "Polish", "Dutch", "Thai", "Gujarati", "Romanian", "Ukrainian", "Malay", "Kannada", "Oriya (Odia)",
    "Burmese (Myanmar)", "Azerbaijani", "Uzbek", "Kurdish (Kurmanji)", "Swedish", "Filipino (Tagalog)",
    "Serbian", "Czech", "Hungarian", "Greek", "Belarusian", "Bulgarian", "Hebrew", "Finnish",
    "Slovak", "Norwegian", "Danish", "Sinhala", "Croatian", "Lithuanian", "Slovenian", "Latvian",
    "Estonian", "Armenian", "Malayalam", "Georgian", "Mongolian", "Afrikaans", "Nepali", "Pashto",
    "Punjabi", "Kurdish", "Kyrgyz", "Somali", "Albanian", "Icelandic", "Basque", "Luxembourgish",
    "Macedonian", "Maltese", "Hawaiian", "Yoruba", "Maori", "Zulu", "Welsh", "Swahili", "Haitian Creole",
    "Lao", "Amharic", "Khmer", "Javanese", "Kazakh", "Malagasy", "Sindhi", "Sundanese", "Tajik", "Xhosa",
    "Yiddish", "Bosnian", "Cebuano", "Chichewa", "Corsican", "Esperanto", "Frisian", "Galician", "Hausa",
    "Hmong", "Igbo", "Irish", "Kinyarwanda", "Latin", "Samoan", "Scots Gaelic", "Sesotho", "Shona",
    "Sotho", "Uyghur"
]

tokenizer = AutoTokenizer.from_pretrained("aixsatoshi/Honyaku-7b-v2")
model = AutoModelForCausalLM.from_pretrained("aixsatoshi/Honyaku-7b-v2", torch_dtype=torch.float16)
model = model.to('cuda:0')


def predict(message, tokens, temperature, language):
    # Build the prompt: source text in <english> tags, then the target-language tag.
    tag = "<" + language.lower() + ">"
    messages = "\n<english>:" + message + "</english>\n" + tag

    model_inputs = tokenizer([messages], return_tensors="pt").to("cuda")
    output = model.generate(
        **model_inputs,
        max_new_tokens=int(tokens),
        temperature=float(temperature),
        do_sample=True,
        top_p=0.95,
        top_k=20,
        repetition_penalty=1.15,
        num_beams=1,
        eos_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens so the prompt is not echoed back.
    new_tokens = output[0][model_inputs.input_ids.shape[1]:]
    translation = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return translation


# Set up the Gradio interface
inputs = [
    gr.Textbox(label="Message", lines=20),
    gr.Slider(100, 4096, value=1000, label="Tokens"),
    gr.Slider(0.0, 1.0, value=0.3, label="Temperature"),
    gr.Dropdown(choices=languages, value="Japanese", label="Language")
]
output = gr.Textbox(label="Translation", lines=35)

demo = gr.Interface(
    fn=predict,
    inputs=inputs,
    outputs=output,
    title="Honyaku-7b webui",
    description="Translate using Honyaku-7b model",
    live=False,  # run the translation only when the button is clicked
    allow_flagging="never"
)

demo.launch()
```
# Base Model

[tokyotech-llm/Swallow-MS-7b-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1)