
results_model8_new

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 5.8002

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 30
  • num_epochs: 20
  • mixed_precision_training: Native AMP
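As a rough illustration only, these values could be expressed through the Transformers Trainer API as sketched below. The output directory is assumed from the model name, the model/tokenizer/dataset setup is omitted because the card does not specify it, and Adam with betas=(0.9, 0.999) and epsilon=1e-08 is simply the Trainer's default optimizer.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameters listed above onto TrainingArguments.
# output_dir is assumed from the model name; model and dataset setup are not shown.
training_args = TrainingArguments(
    output_dir="results_model8_new",
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=30,
    num_train_epochs=20,
    fp16=True,  # "Native AMP" mixed-precision training
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer,
# so no explicit optimizer configuration is needed here.
```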

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| 6.3402        | 1.1141  | 10000  | 6.2499          |
| 6.0161        | 2.2282  | 20000  | 5.9371          |
| 5.7976        | 3.3422  | 30000  | 5.8868          |
| 5.5778        | 4.4563  | 40000  | 5.8308          |
| 5.4306        | 5.5704  | 50000  | 5.8110          |
| 5.3332        | 6.6845  | 60000  | 5.8593          |
| 5.2095        | 7.7986  | 70000  | 5.8150          |
| 5.1312        | 8.9127  | 80000  | 5.7984          |
| 5.0266        | 10.0267 | 90000  | 5.8415          |
| 4.9492        | 11.1408 | 100000 | 5.8116          |
| 4.9416        | 12.2549 | 110000 | 5.8231          |
| 4.8767        | 13.3690 | 120000 | 5.7562          |
| 4.8295        | 14.4831 | 130000 | 5.7837          |
| 4.7927        | 15.5971 | 140000 | 5.7813          |
| 4.7876        | 16.7112 | 150000 | 5.7653          |
| 4.7466        | 17.8253 | 160000 | 5.7583          |
| 4.7208        | 18.9394 | 170000 | 5.8002          |

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1
Model size

  • 63.2M params (F32, stored as Safetensors)
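Because the card does not state the base architecture or pipeline type, the snippet below is only a minimal, hypothetical loading sketch: the repository id is a placeholder, and AutoModel is used because the task-specific head is unknown.

```python
from transformers import AutoModel, AutoTokenizer

# Placeholder repository id; the actual Hub path is not given in this card.
repo_id = "results_model8_new"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)  # base model without a task head, since the pipeline type is unknown
```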