# Llama 1B Tulu-3 Finetuned Model

## Model Description

A 1B-parameter Llama model fully finetuned on the Tulu-3 dataset from AllenAI. This model builds upon Meta's Llama-3.2-1B architecture and incorporates instruction-following capabilities through the Tulu-3 training mixture.

### Base Model

Name: meta-llama/Llama-3.2-1B

### Dataset

allenai/tulu-3-sft-mixture

### Hardware

4x NVIDIA A100 80GB GPUs

### Training Configuration

```shell
--model_name_or_path meta-llama/Llama-3.2-1B \
--dataset_name "allenai/tulu-3-sft-mixture" \
--learning_rate 1.0e-5 \
--lr_scheduler_type linear \
--warmup_ratio 0.03 \
--weight_decay 0.0 \
--num_train_epochs 2 \
--per_device_train_batch_size 8 \
--gradient_accumulation_steps 2 \
--gradient_checkpointing \
--logging_steps 25 \
--bf16 \
--eval_strategy steps \
--eval_steps 5000
```
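For reference, the effective global batch size implied by these flags can be computed from the per-device batch size, the gradient accumulation steps, and the GPU count. This is a minimal sketch assuming standard data parallelism across all four A100s:

```python
# Effective global batch size implied by the training configuration above.
# Values are taken from the flags; num_gpus assumes the 4x A100 setup is
# used for data parallelism (an assumption, not stated in the config itself).
per_device_train_batch_size = 8
gradient_accumulation_steps = 2
num_gpus = 4  # 4x NVIDIA A100 80GB

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(effective_batch_size)  # 64 sequences per optimizer step
```

Under these assumptions, each optimizer step sees 64 sequences, which is why gradient accumulation lets a modest per-device batch size emulate a larger global batch.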