Depth pruned and fine tuned Llama-3.1-8B
LoRA BF16,
batch_size=2,
gradient_accumulation_steps = 4,
warmup_steps = 5,
max_steps = 10000,
learning_rate = 2e-4,
fp16 = not is_bfloat16_supported(),
bf16 = is_bfloat16_supported(),
logging_steps = 1,
optim = "adamw_8bit",
weight_decay = 0.01,
lr_scheduler_type = "linear",
seed = 3407
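
To make the settings above concrete, here is a minimal sketch of what they imply for the effective batch size and the learning-rate curve. It assumes that `batch_size` is a per-device value, that training ran on a single device, and that the "linear" scheduler ramps up linearly during warmup and then decays linearly to zero at `max_steps`. The helper names `effective_batch_size` and `linear_lr` are hypothetical and are not taken from the original training script.

```python
# Hyperparameters copied from the list above.
BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 5
MAX_STEPS = 10_000
PEAK_LR = 2e-4


def effective_batch_size(batch_size=BATCH_SIZE, grad_accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Sequences contributing to one optimizer update.

    num_devices=1 is an assumption; the model card does not state the hardware.
    """
    return batch_size * grad_accum * num_devices


def linear_lr(step, peak=PEAK_LR, warmup=WARMUP_STEPS, max_steps=MAX_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Linear ramp from 0 up to the peak learning rate.
        return peak * (step / max(1, warmup))
    # Linear decay from the peak down to 0 at max_steps.
    remaining = max(0, max_steps - step)
    return peak * (remaining / max(1, max_steps - warmup))


if __name__ == "__main__":
    per_update = effective_batch_size()
    print("sequences per optimizer update:", per_update)
    print("sequences seen over the run:", per_update * MAX_STEPS)
    for s in (0, WARMUP_STEPS, MAX_STEPS // 2, MAX_STEPS):
        print(f"step {s}: lr = {linear_lr(s):.3e}")
```

Under these assumptions, each optimizer update averages gradients over 8 sequences, so the run processes 80,000 sequences in total. With only 5 warmup steps, the learning rate reaches its peak almost immediately and spends the rest of the run decaying.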
[Open-Orca/SlimOrca]
MMLU Pro 0-shot: 0.2937
[TIGER-AI-Lab/MMLU-Pro]
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).