
data

This model is a PEFT adapter fine-tuned from mistralai/Mistral-7B-Instruct-v0.2 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1832
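
To use the adapter, load the base model and attach the fine-tuned weights with PEFT. A minimal loading sketch, assuming the adapter is published on the Hub as edpowers/data (adjust the repo id, dtype, and prompt to your setup):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "edpowers/data"  # assumed Hub repo id for this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # fp16 for inference; pick a dtype that fits your hardware
    device_map="auto",
)
# Attach the fine-tuned PEFT weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Mistral-Instruct expects the [INST] ... [/INST] chat format.
prompt = "[INST] Briefly explain what a PEFT adapter is. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```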

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • training_steps: 1000
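
These settings map directly onto transformers.TrainingArguments. A configuration sketch under that assumption; output_dir is a hypothetical name, and the optimizer relies on the library default matching the Adam settings listed above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-instruct-v0.2-finetune",  # hypothetical name
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 1 x 4 = 4
    lr_scheduler_type="linear",
    warmup_steps=1,
    max_steps=1000,
    # The default AdamW optimizer, with betas=(0.9, 0.999) and eps=1e-8,
    # matches the optimizer settings listed above.
)
```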

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9391        | 0.1479 | 25   | 0.6653          |
| 0.6138        | 0.2959 | 50   | 0.6126          |
| 0.6039        | 0.4438 | 75   | 0.6061          |
| 0.5927        | 0.5917 | 100  | 0.5998          |
| 0.5973        | 0.7396 | 125  | 0.5946          |
| 0.602         | 0.8876 | 150  | 0.5943          |
| 0.547         | 1.0355 | 175  | 0.6319          |
| 0.4239        | 1.1834 | 200  | 0.6169          |
| 0.4301        | 1.3314 | 225  | 0.6158          |
| 0.4176        | 1.4793 | 250  | 0.6193          |
| 0.4295        | 1.6272 | 275  | 0.6242          |
| 0.4252        | 1.7751 | 300  | 0.6265          |
| 0.4252        | 1.9231 | 325  | 0.6264          |
| 0.3591        | 2.0710 | 350  | 0.6893          |
| 0.2758        | 2.2189 | 375  | 0.7153          |
| 0.2702        | 2.3669 | 400  | 0.7170          |
| 0.2797        | 2.5148 | 425  | 0.7173          |
| 0.2727        | 2.6627 | 450  | 0.7144          |
| 0.2817        | 2.8107 | 475  | 0.7169          |
| 0.2798        | 2.9586 | 500  | 0.7016          |
| 0.1922        | 3.1065 | 525  | 0.8090          |
| 0.16          | 3.2544 | 550  | 0.8373          |
| 0.1623        | 3.4024 | 575  | 0.8372          |
| 0.1632        | 3.5503 | 600  | 0.8402          |
| 0.1618        | 3.6982 | 625  | 0.8558          |
| 0.1732        | 3.8462 | 650  | 0.8581          |
| 0.1687        | 3.9941 | 675  | 0.8611          |
| 0.0961        | 4.1420 | 700  | 0.9902          |
| 0.0879        | 4.2899 | 725  | 1.0102          |
| 0.0899        | 4.4379 | 750  | 1.0345          |
| 0.0899        | 4.5858 | 775  | 1.0256          |
| 0.0882        | 4.7337 | 800  | 1.0273          |
| 0.0893        | 4.8817 | 825  | 1.0559          |
| 0.0824        | 5.0296 | 850  | 1.0753          |
| 0.052         | 5.1775 | 875  | 1.1582          |
| 0.052         | 5.3254 | 900  | 1.1643          |
| 0.0526        | 5.4734 | 925  | 1.1923          |
| 0.0497        | 5.6213 | 950  | 1.1759          |
| 0.0496        | 5.7692 | 975  | 1.1812          |
| 0.0477        | 5.9172 | 1000 | 1.1832          |
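
Note that validation loss bottoms out at 0.5943 around step 150 and climbs steadily afterward while training loss keeps falling, a classic overfitting pattern; the headline loss of 1.1832 comes from the final step, not the best checkpoint. If checkpoints from such a run were kept, the best step can be recovered from the Trainer's trainer_state.json log history. A sketch, assuming a locally saved checkpoint directory:

```python
import json

# Assumed path: the Trainer writes trainer_state.json inside each checkpoint dir.
with open("checkpoint-1000/trainer_state.json") as f:
    state = json.load(f)

# log_history interleaves training and evaluation records; keep the eval ones.
evals = [entry for entry in state["log_history"] if "eval_loss" in entry]
best = min(evals, key=lambda entry: entry["eval_loss"])
print(f"Best eval_loss {best['eval_loss']:.4f} at step {best['step']}")
```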

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.0
  • Pytorch 2.2.2
  • Datasets 2.19.0
  • Tokenizers 0.19.1