junk

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 8.1252

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 30
  • num_epochs: 100
  • mixed_precision_training: Native AMP
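To make the schedule concrete, here is a minimal sketch of how a linear scheduler with 30 warmup steps would shape the learning rate over the run. It assumes 400 total optimizer steps (the last step reported in the training results, i.e. 100 epochs at 4 steps per epoch) and decay to zero at the final step; the function name `linear_lr` is hypothetical and is not part of the original training code.

```python
# Hyperparameters taken from the list above.
BASE_LR = 1e-4
WARMUP_STEPS = 30
# Assumption: 400 total optimizer steps, the last step in the training results.
TOTAL_STEPS = 400


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to BASE_LR over the warmup steps.
        scale = step / max(1, WARMUP_STEPS)
    else:
        # Decay from BASE_LR at the end of warmup to 0 at TOTAL_STEPS.
        scale = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return BASE_LR * scale


if __name__ == "__main__":
    for step in (0, 15, 30, 215, 400):
        print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 0.0001 at step 30 and has halved by step 215, so most of the run happens at a steadily shrinking rate.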

Training results

Training Loss | Epoch | Step | Validation Loss
10.42 | 1.25 | 5 | 10.1940
10.1087 | 2.5 | 10 | 9.7539
9.7572 | 3.75 | 15 | 9.4707
9.5321 | 5.0 | 20 | 9.2852
9.13 | 6.25 | 25 | 9.1155
8.9989 | 7.5 | 30 | 8.9138
8.7422 | 8.75 | 35 | 8.7181
8.5133 | 10.0 | 40 | 8.5220
8.0836 | 11.25 | 45 | 8.3687
7.8212 | 12.5 | 50 | 8.2344
7.6616 | 13.75 | 55 | 8.1437
7.4743 | 15.0 | 60 | 8.0750
7.1668 | 16.25 | 65 | 8.0275
7.0485 | 17.5 | 70 | 7.9937
6.9619 | 18.75 | 75 | 7.9525
6.8705 | 20.0 | 80 | 7.9584
6.6232 | 21.25 | 85 | 7.9238
6.6423 | 22.5 | 90 | 7.9155
6.5876 | 23.75 | 95 | 7.9088
6.5075 | 25.0 | 100 | 7.9154
6.4218 | 26.25 | 105 | 7.8957
6.2857 | 27.5 | 110 | 7.9040
6.1833 | 28.75 | 115 | 7.9092
6.1263 | 30.0 | 120 | 7.9198
6.0123 | 31.25 | 125 | 7.9103
5.9111 | 32.5 | 130 | 7.9150
5.9157 | 33.75 | 135 | 7.9178
5.8237 | 35.0 | 140 | 7.9479
5.6626 | 36.25 | 145 | 7.9358
5.657 | 37.5 | 150 | 7.9548
5.5894 | 38.75 | 155 | 7.9572
5.5157 | 40.0 | 160 | 7.9800
5.4606 | 41.25 | 165 | 7.9481
5.2962 | 42.5 | 170 | 7.9568
5.2877 | 43.75 | 175 | 7.9720
5.2395 | 45.0 | 180 | 7.9709
5.1394 | 46.25 | 185 | 7.9900
5.0096 | 47.5 | 190 | 8.0010
4.9646 | 48.75 | 195 | 8.0105
4.973 | 50.0 | 200 | 8.0182
4.866 | 51.25 | 205 | 8.0310
4.8044 | 52.5 | 210 | 8.0372
4.7804 | 53.75 | 215 | 8.0387
4.7187 | 55.0 | 220 | 8.0166
4.6399 | 56.25 | 225 | 8.0598
4.6644 | 57.5 | 230 | 8.0465
4.5318 | 58.75 | 235 | 8.0482
4.4451 | 60.0 | 240 | 8.0538
4.4442 | 61.25 | 245 | 8.0473
4.3778 | 62.5 | 250 | 8.0517
4.4453 | 63.75 | 255 | 8.0740
4.3813 | 65.0 | 260 | 8.0658
4.2654 | 66.25 | 265 | 8.0764
4.2278 | 67.5 | 270 | 8.0737
4.2212 | 68.75 | 275 | 8.0952
4.1481 | 70.0 | 280 | 8.0877
4.162 | 71.25 | 285 | 8.0882
4.077 | 72.5 | 290 | 8.0813
4.0134 | 73.75 | 295 | 8.0862
3.9975 | 75.0 | 300 | 8.0980
3.9174 | 76.25 | 305 | 8.0989
3.9748 | 77.5 | 310 | 8.0903
3.9362 | 78.75 | 315 | 8.1109
3.8585 | 80.0 | 320 | 8.1049
3.8832 | 81.25 | 325 | 8.1076
3.8799 | 82.5 | 330 | 8.1078
3.8354 | 83.75 | 335 | 8.1073
3.8073 | 85.0 | 340 | 8.1182
3.8701 | 86.25 | 345 | 8.1179
3.7696 | 87.5 | 350 | 8.1204
3.7907 | 88.75 | 355 | 8.1187
3.7428 | 90.0 | 360 | 8.1172
3.7048 | 91.25 | 365 | 8.1201
3.724 | 92.5 | 370 | 8.1205
3.7308 | 93.75 | 375 | 8.1191
3.7665 | 95.0 | 380 | 8.1211
3.6804 | 96.25 | 385 | 8.1244
3.6001 | 97.5 | 390 | 8.1220
3.6411 | 98.75 | 395 | 8.1245
3.6321 | 100.0 | 400 | 8.1252
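
The results above show validation loss reaching its lowest value, 7.8957, at step 105 (epoch 26.25) and then drifting up to 8.1252 by step 400, while training loss keeps falling. That pattern is typical of overfitting. The sketch below shows one way to locate the best evaluation point from such a log. It uses only a subset of the reported rows, and the names `ROWS` and `best_checkpoint` are hypothetical helpers, not part of the original training code.

```python
# (epoch, step, training_loss, validation_loss) rows copied from the results above.
# Only a subset is listed here to keep the sketch short.
ROWS = [
    (1.25, 5, 10.42, 10.1940),
    (12.5, 50, 7.8212, 8.2344),
    (23.75, 95, 6.5876, 7.9088),
    (25.0, 100, 6.5075, 7.9154),
    (26.25, 105, 6.4218, 7.8957),
    (27.5, 110, 6.2857, 7.9040),
    (50.0, 200, 4.973, 8.0182),
    (75.0, 300, 3.9975, 8.0980),
    (100.0, 400, 3.6321, 8.1252),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


best = best_checkpoint(ROWS)
final = ROWS[-1]
# Gap between validation and training loss at the last evaluation.
final_gap = final[3] - final[2]

if __name__ == "__main__":
    print(f"best validation loss {best[3]} at step {best[1]} (epoch {best[0]})")
    print(f"final validation loss {final[3]}, train/validation gap {final_gap:.4f}")
```

If the run were repeated, stopping near the best evaluation point, or keeping the checkpoint with the lowest validation loss, would likely give a better model than the final one reported here.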

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Safetensors

  • Model size: 75.9M params
  • Tensor type: F32