
Summary4500_M2_1000steps_1e8rate_SFT

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9634
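For intuition, this loss corresponds to a token-level perplexity of roughly 7.12, assuming (as is standard for this trainer) that the reported value is a mean cross-entropy per token in nats. A minimal sketch of the conversion:

```python
import math

# Final evaluation loss reported above (assumed mean cross-entropy per token, in nats).
eval_loss = 1.9634

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")
```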

Model description

More information needed

Intended uses & limitations

More information needed
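Since the base model is Mistral-7B-Instruct-v0.2, prompts presumably follow the Mistral instruction format. A minimal sketch of the expected prompt layout, assuming the fine-tune kept the base model's chat template (`build_prompt` is an illustrative helper, not part of this repository):

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Mistral-Instruct [INST] format.

    Note: tokenizers for this model family usually add the <s> BOS token
    automatically; it is written out here only to show the full layout.
    """
    return f"<s>[INST] {instruction.strip()} [/INST]"

print(build_prompt("Summarize the following article: ..."))
```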

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-08
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1000
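The configured schedule (cosine decay with 100 warmup steps over 1000 training steps, peaking at 1e-8) can be sketched as follows. This mirrors the behaviour of a linear-warmup cosine scheduler such as `transformers`' `get_cosine_schedule_with_warmup`; it is a standalone illustration, not the actual training code:

```python
import math

PEAK_LR = 1e-8        # learning_rate
WARMUP_STEPS = 100    # lr_scheduler_warmup_steps
TOTAL_STEPS = 1000    # training_steps

def lr_at(step: int) -> float:
    """Sketch of the configured schedule: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

for s in (0, 50, 100, 550, 1000):
    print(f"step {s:4d}: lr = {lr_at(s):.3e}")
```

Note that the effective batch size of 4 (total_train_batch_size) follows from train_batch_size × gradient_accumulation_steps = 2 × 2.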

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.9563        | 0.0447 | 50   | 1.9699          |
| 1.9559        | 0.0895 | 100  | 1.9696          |
| 1.9636        | 0.1342 | 150  | 1.9675          |
| 1.9608        | 0.1790 | 200  | 1.9666          |
| 1.9525        | 0.2237 | 250  | 1.9654          |
| 1.9514        | 0.2685 | 300  | 1.9645          |
| 1.9704        | 0.3132 | 350  | 1.9644          |
| 1.9596        | 0.3579 | 400  | 1.9639          |
| 1.9558        | 0.4027 | 450  | 1.9641          |
| 1.9481        | 0.4474 | 500  | 1.9635          |
| 1.9450        | 0.4922 | 550  | 1.9639          |
| 1.9532        | 0.5369 | 600  | 1.9634          |
| 1.9550        | 0.5817 | 650  | 1.9642          |
| 1.9589        | 0.6264 | 700  | 1.9635          |
| 1.9638        | 0.6711 | 750  | 1.9632          |
| 1.9679        | 0.7159 | 800  | 1.9634          |
| 1.9484        | 0.7606 | 850  | 1.9634          |
| 1.9593        | 0.8054 | 900  | 1.9634          |
| 1.9598        | 0.8501 | 950  | 1.9634          |
| 1.9584        | 0.8949 | 1000 | 1.9634          |

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.0.0+cu117
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Model size: 7.24B parameters (FP16, Safetensors)