MaziyarPanahi/calme-2.2-llama3-70b

This model is a DPO fine-tune of the meta-llama/Meta-Llama-3-70B-Instruct model.

PS: This fine-tuned model was previously known as MaziyarPanahi/Llama-3-70B-Instruct-DPO-v0.2. It was renamed to avoid any confusion with the original model.

⚡ Quantized GGUF

All GGUF models are available here: MaziyarPanahi/calme-2.2-llama3-70b-GGUF

🏆 Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	37.98
IFEval (0-Shot)	82.08
BBH (3-Shot)	48.57
MATH Lvl 5 (4-Shot)	22.96
GPQA (0-shot)	12.19
MuSR (0-shot)	15.30
MMLU-PRO (5-shot)	46.74

Metric	Value
Avg.	78.96
AI2 Reasoning Challenge (25-Shot)	72.53
HellaSwag (10-Shot)	86.22
MMLU (5-Shot)	80.41
TruthfulQA (0-shot)	63.57
Winogrande (5-shot)	82.79
GSM8k (5-shot)	88.25
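As a quick sanity check, the "Avg." row in each table above can be compared with the plain arithmetic mean of the six benchmark scores listed under it. The sketch below is illustrative only: the names `first_table` and `second_table` are made up for this example, and it assumes the leaderboard average is an unweighted mean.

```python
from statistics import mean

# Scores copied from the two tables above (the "Avg." rows are left out).
first_table = {
    "IFEval (0-Shot)": 82.08,
    "BBH (3-Shot)": 48.57,
    "MATH Lvl 5 (4-Shot)": 22.96,
    "GPQA (0-shot)": 12.19,
    "MuSR (0-shot)": 15.30,
    "MMLU-PRO (5-shot)": 46.74,
}
second_table = {
    "AI2 Reasoning Challenge (25-Shot)": 72.53,
    "HellaSwag (10-Shot)": 86.22,
    "MMLU (5-Shot)": 80.41,
    "TruthfulQA (0-shot)": 63.57,
    "Winogrande (5-shot)": 82.79,
    "GSM8k (5-shot)": 88.25,
}

# Assumption: the reported average is an unweighted mean of the six scores.
first_avg = mean(first_table.values())
second_avg = mean(second_table.values())

print(f"First table mean:  {first_avg:.2f}")   # reported Avg.: 37.98
print(f"Second table mean: {second_avg:.2f}")  # reported Avg.: 78.96
```

The second table reproduces the reported 78.96. The first comes out at 37.97 rather than 37.98, which is presumably because the leaderboard averages unrounded per-benchmark scores.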

Top 10 models on the Leaderboard: Llama-3-70B fine-tuned models

Prompt Template

This model uses the ChatML prompt template:

<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
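To make the template concrete, here is a minimal sketch that assembles a prompt string in the layout shown above. The helper `build_chatml_prompt` is a hypothetical name introduced for illustration. In practice the chat template shipped with the tokenizer (used through `tokenizer.apply_chat_template`) is authoritative, and its exact whitespace may differ from this sketch.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout shown above.

    Hypothetical helper for illustration; it mirrors the card's template,
    which puts <|im_end|> on its own line after each message.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

Leaving the assistant turn open corresponds to the `{Assistant}` slot in the template: the model's reply is generated from that point on.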

How to use

To use this model, pass MaziyarPanahi/calme-2.2-llama3-70b as the model name in Hugging Face's transformers library.

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from transformers import pipeline
import torch

model_id = "MaziyarPanahi/calme-2.2-llama3-70b"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

streamer = TextStreamer(tokenizer)

# Name the pipeline object `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
    streamer=streamer
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the messages with the tokenizer's chat template and open the assistant turn.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop generation on any of these end-of-turn tokens.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|im_end|>"),
    tokenizer.convert_tokens_to_ids("<|eot_id|>") # safer to have this too
]

outputs = pipe(
    prompt,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(outputs[0]["generated_text"][len(prompt):])


Safetensors

Model size: 70.6B params
Tensor type: BF16


Evaluation results

All results are from the Open LLM Leaderboard.

normalized accuracy on AI2 Reasoning Challenge (25-Shot), test set: 72.530
normalized accuracy on HellaSwag (10-Shot), validation set: 86.220
accuracy on MMLU (5-Shot), test set: 80.410
mc2 on TruthfulQA (0-shot), validation set: 63.570
accuracy on Winogrande (5-shot), validation set: 82.790
accuracy on GSM8k (5-shot), test set: 88.250
strict accuracy on IFEval (0-Shot): 82.080
normalized accuracy on BBH (3-Shot): 48.570
exact match on MATH Lvl 5 (4-Shot): 22.960
acc_norm on GPQA (0-shot): 12.190
