---
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: flan-t5-base-eng-hwp
    results: []
---

# flan-t5-base-eng-hwp

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6073
  • Bleu: 4.9074
  • Gen Len: 18.8338
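
The card does not yet document usage, so here is a minimal inference sketch. The Hub repo id `claudiatang/flan-t5-base-eng-hwp` is inferred from the author and model name rather than stated in the card; substitute your own repo id or a local checkpoint path as needed.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id; adjust if the checkpoint lives elsewhere.
model_id = "claudiatang/flan-t5-base-eng-hwp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
# Gen Len above averages ~18.8 tokens, consistent with a generation cap
# around 20; raise max_length if you need longer outputs.
outputs = model.generate(**inputs, max_length=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```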

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
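
The listed hyperparameters map directly onto `Seq2SeqTrainingArguments` fields; the Adam betas and epsilon above are the `Trainer` defaults. This is a hedged reconstruction, not the card's actual training script; `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions based on how the per-epoch results below were produced.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-eng-hwp",  # assumed output path
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    evaluation_strategy="epoch",   # matches the per-epoch results table
    predict_with_generate=True,    # required to report Bleu and Gen Len
)
```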

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 420  | 1.6234          | 3.464  | 18.8    |
| 2.1287        | 2.0   | 840  | 1.4706          | 4.1392 | 18.8084 |
| 1.5094        | 3.0   | 1260 | 1.4294          | 4.3477 | 18.7924 |
| 1.2917        | 4.0   | 1680 | 1.4015          | 4.5189 | 18.8068 |
| 1.1347        | 5.0   | 2100 | 1.3900          | 4.697  | 18.8236 |
| 1.012         | 6.0   | 2520 | 1.4038          | 4.7522 | 18.8051 |
| 1.012         | 7.0   | 2940 | 1.4086          | 4.8399 | 18.8177 |
| 0.901         | 8.0   | 3360 | 1.4453          | 4.8191 | 18.8253 |
| 0.818         | 9.0   | 3780 | 1.4678          | 4.8245 | 18.8203 |
| 0.7511        | 10.0  | 4200 | 1.4922          | 4.951  | 18.8574 |
| 0.693         | 11.0  | 4620 | 1.5186          | 4.9174 | 18.8363 |
| 0.6462        | 12.0  | 5040 | 1.5487          | 5.0009 | 18.8338 |
| 0.6462        | 13.0  | 5460 | 1.5651          | 5.021  | 18.8295 |
| 0.6062        | 14.0  | 5880 | 1.5942          | 4.8801 | 18.8245 |
| 0.5781        | 15.0  | 6300 | 1.6073          | 4.9074 | 18.8338 |
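
The Bleu and Gen Len columns above follow the pattern of the standard transformers translation examples. The card does not specify the exact metric code, so this is a sketch of how such a `compute_metrics` function is typically written with the `evaluate` library and sacrebleu:

```python
import evaluate
import numpy as np

bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    # Seq2SeqTrainer passes (predictions, labels); bind `tokenizer` via a
    # closure or functools.partial when wiring this into the trainer.
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # -100 marks ignored label positions; swap in pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # "Gen Len" is the mean count of non-padding tokens in the generations.
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id)
                       for pred in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```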

### Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.14.6
  • Tokenizers 0.14.1