flan-t5-ellis-way-v2

This model is a fine-tuned version of google/flan-t5-base. The training dataset is not specified (it is recorded as "None"). At the end of training (step 58000) it achieves the following results on the evaluation set:

  • Loss: 1.1617
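
Since the card gives no usage example, the following is a minimal sketch of loading the checkpoint with the standard Transformers seq2seq classes. It assumes the repository ID wayminder/flan-t5-ellis-way-v2 and that tokenizer files are stored alongside the weights; if they are not, the google/flan-t5-base tokenizer would be the natural fallback. The helper name `generate`, the `max_new_tokens` default, and any prompt you pass are illustrative, because the expected input format is not documented.

```python
MODEL_ID = "wayminder/flan-t5-ellis-way-v2"  # assumed repository ID


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Run one prompt through the checkpoint and return the decoded text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # T5-style models take plain text in and produce plain text out.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (hypothetical prompt; downloads the weights on first use):
# print(generate("your input text here"))
```

The weights are stored in F32 with about 248M parameters, so the model should fit comfortably on CPU for single-prompt inference.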

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 200
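
The list above maps fairly directly onto `transformers.TrainingArguments`. The sketch below is one plausible reconstruction, not the author's actual script: it assumes a single device (so 1 × 16 accumulation steps gives the total batch size of 16), and `build_training_args`, `effective_batch_size`, and the `output_dir` argument are hypothetical names introduced for illustration.

```python
# Keyword arguments mirroring the hyperparameter list above.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 200,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir: str):
    """Build TrainingArguments from the values above (output_dir is hypothetical)."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))  # 1 * 16 * 1 = 16
```

Settings that the card does not list, such as the evaluation and logging interval, are left at their defaults here.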

Training results

Training Loss   Epoch   Step   Validation Loss
1.0607 1.71 500 0.8848
0.8287 3.42 1000 0.7583
0.7292 5.13 1500 0.7125
0.734 6.84 2000 0.6789
0.6723 8.55 2500 0.6560
0.6169 10.26 3000 0.6440
0.5983 11.97 3500 0.6257
0.591 13.68 4000 0.6267
0.5581 15.39 4500 0.6153
0.5225 17.1 5000 0.6148
0.5249 18.81 5500 0.6138
0.5052 20.52 6000 0.6106
0.4441 22.23 6500 0.6065
0.455 23.94 7000 0.6107
0.4198 25.65 7500 0.6175
0.3975 27.36 8000 0.6147
0.3991 29.07 8500 0.6161
0.4088 30.78 9000 0.6215
0.373 32.49 9500 0.6229
0.3769 34.2 10000 0.6200
0.3619 35.91 10500 0.6280
0.3522 37.61 11000 0.6354
0.3412 39.32 11500 0.6396
0.3325 41.03 12000 0.6420
0.3235 42.74 12500 0.6477
0.313 44.45 13000 0.6592
0.2881 46.16 13500 0.6690
0.3132 47.87 14000 0.6713
0.2774 49.58 14500 0.6858
0.2797 51.29 15000 0.6813
0.2725 53.0 15500 0.6872
0.2561 54.71 16000 0.7032
0.244 56.42 16500 0.7145
0.2467 58.13 17000 0.7287
0.2483 59.84 17500 0.7305
0.2461 61.55 18000 0.7261
0.2297 63.26 18500 0.7400
0.2372 64.97 19000 0.7393
0.2192 66.68 19500 0.7504
0.2045 68.39 20000 0.7618
0.1823 70.1 20500 0.7855
0.2013 71.81 21000 0.7768
0.2011 73.52 21500 0.7907
0.1819 75.23 22000 0.8015
0.1805 76.94 22500 0.7997
0.1779 78.65 23000 0.8211
0.1716 80.36 23500 0.8221
0.1643 82.07 24000 0.8393
0.1586 83.78 24500 0.8258
0.1613 85.49 25000 0.8411
0.1493 87.2 25500 0.8518
0.1549 88.91 26000 0.8599
0.1479 90.62 26500 0.8742
0.1485 92.33 27000 0.8752
0.1378 94.04 27500 0.8784
0.1357 95.75 28000 0.8893
0.1395 97.46 28500 0.9146
0.1277 99.17 29000 0.9052
0.1272 100.88 29500 0.9145
0.1305 102.59 30000 0.9219
0.1212 104.3 30500 0.9261
0.1221 106.01 31000 0.9443
0.1162 107.72 31500 0.9420
0.1175 109.43 32000 0.9465
0.1092 111.13 32500 0.9682
0.1141 112.84 33000 0.9717
0.1104 114.55 33500 0.9667
0.1031 116.26 34000 0.9839
0.1037 117.97 34500 0.9852
0.1131 119.68 35000 0.9852
0.0965 121.39 35500 1.0028
0.0975 123.1 36000 1.0021
0.0968 124.81 36500 1.0105
0.0964 126.52 37000 1.0223
0.0938 128.23 37500 1.0217
0.0931 129.94 38000 1.0297
0.0861 131.65 38500 1.0355
0.0907 133.36 39000 1.0438
0.0914 135.07 39500 1.0437
0.0869 136.78 40000 1.0455
0.0842 138.49 40500 1.0635
0.0859 140.2 41000 1.0559
0.0854 141.91 41500 1.0519
0.0799 143.62 42000 1.0687
0.0805 145.33 42500 1.0685
0.0786 147.04 43000 1.0736
0.0726 148.75 43500 1.0865
0.0818 150.46 44000 1.0878
0.0793 152.17 44500 1.0877
0.0757 153.88 45000 1.0996
0.0731 155.59 45500 1.0994
0.0751 157.3 46000 1.1119
0.0729 159.01 46500 1.1097
0.0686 160.72 47000 1.0965
0.0681 162.43 47500 1.1212
0.0705 164.14 48000 1.1200
0.0712 165.85 48500 1.1177
0.0646 167.56 49000 1.1298
0.0656 169.27 49500 1.1298
0.0699 170.98 50000 1.1339
0.067 172.69 50500 1.1436
0.0647 174.4 51000 1.1394
0.0669 176.11 51500 1.1472
0.0624 177.82 52000 1.1522
0.07 179.53 52500 1.1453
0.0692 181.24 53000 1.1482
0.0633 182.95 53500 1.1533
0.0611 184.65 54000 1.1526
0.067 186.36 54500 1.1538
0.06 188.07 55000 1.1582
0.0652 189.78 55500 1.1565
0.0602 191.49 56000 1.1577
0.0622 193.2 56500 1.1606
0.0652 194.91 57000 1.1603
0.0626 196.62 57500 1.1617
0.0563 198.33 58000 1.1617
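
The table shows validation loss bottoming out early and then rising while training loss keeps falling, which is the usual signature of overfitting. The sketch below makes that concrete using a contiguous excerpt of the rows around the minimum, plus the final row. The helpers `best_checkpoint` and `early_stop_step` and the patience value of 3 are illustrative assumptions, not part of the original training run.

```python
# Contiguous excerpt of the log above: (step, epoch, train_loss, val_loss).
LOG = [
    (5500, 18.81, 0.5249, 0.6138),
    (6000, 20.52, 0.5052, 0.6106),
    (6500, 22.23, 0.4441, 0.6065),
    (7000, 23.94, 0.4550, 0.6107),
    (7500, 25.65, 0.4198, 0.6175),
    (8000, 27.36, 0.3975, 0.6147),
    (8500, 29.07, 0.3991, 0.6161),
    (9000, 30.78, 0.4088, 0.6215),
]
FINAL = (58000, 198.33, 0.0563, 1.1617)  # last row of the table


def best_checkpoint(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[3])


def early_stop_step(log, patience):
    """Step at which validation loss has failed to improve `patience` evaluations in a row."""
    best, stale = float("inf"), 0
    for step, _epoch, _train, val in log:
        if val < best:
            best, stale = val, 0
        else:
            stale += 1
            if stale >= patience:
                return step
    return None


best = best_checkpoint(LOG)
print(best[0], best[3])                   # 6500 0.6065
print(early_stop_step(LOG, patience=3))   # 8000
print(round(FINAL[3] - best[3], 4))       # 0.5552
```

On these numbers, the checkpoint at step 6500 (epoch 22.23) has the lowest validation loss in the whole table, and the reported final loss of 1.1617 is roughly 0.56 higher. Anyone retraining may want to keep the best checkpoint rather than the last one.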

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.1.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • Parameters: 248M
  • Tensor type: F32
  • Weights format: Safetensors

Model tree

  • Base model: google/flan-t5-base
  • This model: wayminder/flan-t5-ellis-way-v2 (fine-tuned from the base model)