---
base_model:
- meta-llama/Llama-3.1-70B
- allenai/Llama-3.1-Tulu-3-70B
- tokyotech-llm/Llama-3.1-Swallow-70B-v0.1
library_name: transformers
tags:
- mergekit
- merge
- chat
language:
- ja
- en
pipeline_tag: text-generation
license: llama3.1
---
# Llama-3.1-SuperSwallow-70B-Instruct-v0.1

> [Open Japanese LLM Leaderboard](https://huggingface.co/spaces/llm-jp/open-japanese-llm-leaderboard) 🏆 Rank 1 (2024/12/03)

🙏 A big thank you to [@tokyotech-llm](https://huggingface.co/tokyotech-llm) and [@allenai](https://huggingface.co/allenai).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630779c4f0dc38fb47ba6368/9pXuTcD4vNV2Lh_DZ8M5R.png)

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Test environment

This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). I used the `min_p` preset with temperature=1 for generation.

## Usage

This format must be adhered to strictly, as deviations may result in suboptimal outputs from the model. The template used to construct a prompt for the instruct model is specified as follows:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{SYSTEM_PROMPT}<|eot_id|><|start_header_id|>user<|end_header_id|>

{USER_MESSAGE}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

For the `{SYSTEM_PROMPT}` part, we recommend "あなたは誠実で優秀な日本人のアシスタントです。" ("You are a sincere and excellent Japanese assistant.") or "You are a helpful assistant."

For the `{USER_MESSAGE}` part, we recommend `{instruction}\n{input}`.

In other words, we recommend the following:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

あなたは誠実で優秀な日本人のアシスタントです。<|eot_id|><|start_header_id|>user<|end_header_id|>

{instruction}
{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

### Use the instruct model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nitky/Llama-3.1-SuperSwallow-70B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# apply_chat_template renders the messages into the Llama 3.1 prompt format shown above
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# strip the prompt tokens so only the newly generated continuation remains
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

## Merge Details

### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B) as the base.
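Task arithmetic treats each fine-tuned model as a "task vector" (its parameter delta from the base model) and adds these vectors to the base, scaled by per-model weights. A minimal per-tensor sketch of the idea (illustrative only, not mergekit's actual implementation):

```python
import torch

def task_arithmetic(base: torch.Tensor, tuned: list[torch.Tensor],
                    weights: list[float]) -> torch.Tensor:
    """Merge one parameter tensor: base + sum_i w_i * (tuned_i - base)."""
    merged = base.clone()
    for t, w in zip(tuned, weights):
        merged += w * (t - base)  # add this model's scaled task vector
    return merged

# With the weights from the configuration below, applied tensor-by-tensor:
# merged = llama + 1.0 * (swallow - llama) + 0.8 * (tulu - llama)
```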
### Models Merged

The following models were included in the merge:

* [allenai/Llama-3.1-Tulu-3-70B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-70B)
* [tokyotech-llm/Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: task_arithmetic
base_model: meta-llama/Llama-3.1-70B
models:
  - model: tokyotech-llm/Llama-3.1-Swallow-70B-v0.1
    parameters:
      weight: 1.0
  - model: allenai/Llama-3.1-Tulu-3-70B
    parameters:
      weight: 0.8
dtype: bfloat16
name: Llama-3.1-SuperSwallow-70B-Instruct-v0.1
```
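To reproduce the merge, save the configuration above as `config.yaml` and run mergekit's command-line entry point. A hedged sketch, assuming a recent `mergekit` install (exact flags may vary by version):

```
mergekit-yaml config.yaml ./Llama-3.1-SuperSwallow-70B-Instruct-v0.1 --cuda
```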