Tags: PEFT · Safetensors · llama · alignment-handbook · trl · sft · Generated from Trainer

Meta-Llama-3-8B-Instruct-mirage-all-teacher-instruct-llama-3-sft

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the nthakur/mirage-gpt-4o-sft-instruct-llama-3 and nthakur/mirage-meta-llama-3-mistral-sft-instruct-meta-llama-tokenizer datasets. It achieves the following result on the evaluation set (a minimal loading sketch follows):

  • Loss: 0.2593
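Because this is a PEFT adapter rather than a full checkpoint, it must be attached to the base model at load time. The snippet below is a minimal usage sketch, not code from the original card; it assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights, and the bfloat16 dtype and example prompt are illustrative placeholders.

```python
# Minimal usage sketch (not from the original card). Assumes access to the
# gated meta-llama/Meta-Llama-3-8B-Instruct base weights; the dtype and
# prompt below are placeholders.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "nthakur/Meta-Llama-3-8B-Instruct-mirage-all-teacher-instruct-llama-3-sft"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned PEFT adapter on top of the frozen base model.
model = PeftModel.from_pretrained(model, adapter_id)

messages = [{"role": "user", "content": "Answer using the retrieved passages: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```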

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The model was trained and evaluated on the nthakur/mirage-gpt-4o-sft-instruct-llama-3 and nthakur/mirage-meta-llama-3-mistral-sft-instruct-meta-llama-tokenizer datasets listed above; no further details are provided.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
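
For context, the list above maps onto transformers.TrainingArguments roughly as in the sketch below. This is not the author's training script; the output directory is a placeholder and the bf16 flag is an assumption (the training dtype is not stated in the card).

```python
# Hedged sketch of the reported hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs/mirage-all-teacher-sft",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,  # 2 per device x 2 accumulation x 4 GPUs = 16 effective
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    bf16=True,  # assumption; dtype not stated in the card
)
```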

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3535        | 0.0412 | 200  | 0.3586          |
| 0.4117        | 0.0824 | 400  | 0.3371          |
| 0.3577        | 0.1236 | 600  | 0.3277          |
| 0.3594        | 0.1649 | 800  | 0.3194          |
| 0.3603        | 0.2061 | 1000 | 0.3096          |
| 0.3633        | 0.2473 | 1200 | 0.3063          |
| 0.3078        | 0.2885 | 1400 | 0.3000          |
| 0.3274        | 0.3297 | 1600 | 0.2948          |
| 0.3474        | 0.3709 | 1800 | 0.2925          |
| 0.3401        | 0.4122 | 2000 | 0.2875          |
| 0.3124        | 0.4534 | 2200 | 0.2839          |
| 0.3095        | 0.4946 | 2400 | 0.2802          |
| 0.3532        | 0.5358 | 2600 | 0.2775          |
| 0.301         | 0.5770 | 2800 | 0.2757          |
| 0.3204        | 0.6182 | 3000 | 0.2712          |
| 0.3158        | 0.6595 | 3200 | 0.2687          |
| 0.3032        | 0.7007 | 3400 | 0.2667          |
| 0.2851        | 0.7419 | 3600 | 0.2645          |
| 0.2903        | 0.7831 | 3800 | 0.2629          |
| 0.2943        | 0.8243 | 4000 | 0.2613          |
| 0.2787        | 0.8655 | 4200 | 0.2603          |
| 0.2558        | 0.9067 | 4400 | 0.2596          |
| 0.3107        | 0.9480 | 4600 | 0.2593          |
| 0.2894        | 0.9892 | 4800 | 0.2593          |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
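
To reproduce this environment, the listed versions can be pinned, for example as below; using the CUDA 12.1 wheel index for PyTorch is an assumption based on the 2.4.0+cu121 version tag.

```
pip install peft==0.10.0 transformers==4.44.0 datasets==2.20.0 tokenizers==0.19.1
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
```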