# Llama-3.1-8B-Chat

meta-llama/Meta-Llama-3.1-8B fine-tuned for chat completions.

Obligatory notice: this model was Built with Llama.
## Quick start
Simply load the model and generate responses:
```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)

model = AutoModelForCausalLM.from_pretrained("mathewhe/Llama-3.1-8B-Chat")
tokenizer = AutoTokenizer.from_pretrained("mathewhe/Llama-3.1-8B-Chat")

messages = [
    {"role": "user", "content": "What is an LLM?"},
]

# apply_chat_template() returns plain token IDs by default; request a dict
# of tensors so the result can be passed directly to generate().
inputs = tokenizer.apply_chat_template(messages, return_dict=True, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs)[0]))
```
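If you prefer the high-level API, recent versions of transformers also let the text-generation pipeline consume chat messages directly. A minimal sketch (the `max_new_tokens` value is illustrative, not a recommended setting):

```python
from transformers import pipeline

# The pipeline applies the model's chat template internally.
chat_pipe = pipeline("text-generation", model="mathewhe/Llama-3.1-8B-Chat")

messages = [{"role": "user", "content": "What is an LLM?"}]
print(chat_pipe(messages, max_new_tokens=256)[0]["generated_text"])
```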
Alternatively, copy the included `chat_class.py` module into your local directory and just import the `Chat` class:
```python
from chat_class import Chat

chat = Chat("mathewhe/Llama-3.1-8B-Chat", device="cuda")

# For one-off instructions
instruction = "Write an ingredient list for banana pudding."
print(chat.instruct(instruction))

# For multi-turn chat
response1 = chat.message("Hi, please explain what DNA is.")
response2 = chat.message("Tell me more about how its discovery affected society.")

# To reset the chat
chat.reset()
```
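Conceptually, such a wrapper just keeps a running message list and re-applies the chat template on every turn. The hypothetical `MiniChat` below sketches that idea; the actual implementation lives in `chat_class.py` and may differ:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

class MiniChat:
    """Hypothetical minimal chat wrapper (illustration only; see chat_class.py)."""

    def __init__(self, model_id, device="cuda"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        self.model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
        self.history = []

    def message(self, text):
        # Append the user turn, re-render the whole conversation, and generate.
        self.history.append({"role": "user", "content": text})
        inputs = self.tokenizer.apply_chat_template(
            self.history, return_dict=True, return_tensors="pt"
        ).to(self.model.device)
        output = self.model.generate(**inputs, max_new_tokens=512)
        # Decode only the newly generated tokens, not the prompt.
        reply = self.tokenizer.decode(
            output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
        )
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def reset(self):
        self.history = []
```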
## Performance
We verified that this model is aligned for both multi-turn dialogue and one-off instruction following.

- This model generates relatively short completions, which leads to a low raw win rate on AlpacaEval because of that benchmark's known length bias.
- Its length-corrected (LC) win rate is on par with that of Meta's 8B instruct variant, which was trained on an unreleased dataset.
| Model | AlpacaEval | AlpacaEval-LC |
|---|---|---|
| meta-llama/Meta-Llama-3.1-8B-Instruct | 21.84 | 20.85 |
| mathewhe/Llama-3.1-8B-Chat | 12.16 | 20.53 |
## Chat template
This model uses the following chat template and does not support a separate system prompt:

```
<|begin_of_text|>[INST]<user-message>[/INST][ASST]<llm-response>[/ASST]<|end_of_text|>
```
The included tokenizer will correctly format messages, so you should not have to manually format the input text.
Instead, use the tokenizer's `apply_chat_template()` method on a list of messages.
Each message should be a dict with two keys:
- "role": Either "user" or "assistant".
- "content": The message to include.
For example:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mathewhe/Llama-3.1-8B-Chat")

messages = [
    {"role": "user", "content": "Solve for x: 3x=4"},
    {"role": "assistant", "content": "3x=4\n(3x)/3=(4)/3\nx=4/3"},
    {"role": "user", "content": "Please explain your work."},
]

print(tokenizer.apply_chat_template(messages, tokenize=False))
```
which outputs:
```
<|begin_of_text|>[INST]Solve for x: 3x=4[/INST][ASST]3x=4
(3x)/3=(4)/3
x=4/3[/ASST]<|end_of_text|><|begin_of_text|>[INST]Please explain your work.[/INST]
```
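For illustration only, the hypothetical helper below reproduces this template by hand, assuming each user/assistant pair is wrapped in `<|begin_of_text|>` ... `<|end_of_text|>` exactly as shown above; in practice, prefer `apply_chat_template()`:

```python
def format_chat(messages):
    """Hypothetical manual re-implementation of the chat template (illustration only)."""
    text = ""
    for m in messages:
        if m["role"] == "user":
            # Each user turn opens a new <|begin_of_text|> block.
            text += f"<|begin_of_text|>[INST]{m['content']}[/INST]"
        else:
            # Assistant turns close the block with <|end_of_text|>.
            text += f"[ASST]{m['content']}[/ASST]<|end_of_text|>"
    return text
```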
See the example code in the included `chat_class.py` module for more details.
## Data
This model was trained on the following three datasets: