---
license: apache-2.0
base_model: google/flan-t5-large
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-large-spelling-peft
    results: []
---

# flan-t5-large-spelling-peft

This model is a fine-tuned version of google/flan-t5-large on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.2537
- Rouge1: 95.8905
- Rouge2: 91.9178
- Rougel: 95.8459
- Rougelsum: 95.8393
- Gen Len: 33.61
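
The card does not include usage instructions. The sketch below shows one plausible way to load a PEFT adapter on top of the google/flan-t5-large base model; the adapter repo id and the prompt format are assumptions, not documented on this card.

```python
# Minimal inference sketch, assuming the adapter lives at
# "jbochi/flan-t5-large-spelling-peft" (inferred from the model name, not confirmed).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")
model = PeftModel.from_pretrained(base, "jbochi/flan-t5-large-spelling-peft")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

# Hypothetical prompt; the training prompt format is not documented here.
inputs = tokenizer("Fix spelling: I beleive this sentnce has erors.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```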

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
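
These settings map directly onto transformers' `Seq2SeqTrainingArguments` (the Adam betas and epsilon listed above are the Trainer defaults). The sketch below is a reconstruction under that assumption; the LoRA values are placeholders, since the card does not record the adapter configuration.

```python
# Reconstruction of the training setup from the hyperparameters above.
# The LoraConfig values (r, lora_alpha, lora_dropout) are assumptions.
from transformers import Seq2SeqTrainingArguments
from peft import LoraConfig, TaskType, get_peft_model

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-spelling-peft",
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    predict_with_generate=True,  # needed to compute ROUGE / Gen Len at eval time
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are already the defaults
)

# Hypothetical adapter config for a T5-style seq2seq model.
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
# model = get_peft_model(base_model, peft_config)
```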

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.3359        | 0.05  | 500  | 0.2738          | 95.8385 | 91.6723 | 95.7821 | 95.766    | 33.5    |
| 0.2853        | 0.11  | 1000 | 0.2702          | 95.7124 | 91.5043 | 95.656  | 95.651    | 33.53   |
| 0.2691        | 0.16  | 1500 | 0.2691          | 95.735  | 91.7108 | 95.7039 | 95.7067   | 33.41   |
| 0.2596        | 0.21  | 2000 | 0.2663          | 95.9819 | 92.0897 | 95.9519 | 95.9488   | 33.51   |
| 0.2536        | 0.27  | 2500 | 0.2621          | 95.7519 | 91.5445 | 95.6614 | 95.6622   | 33.49   |
| 0.2472        | 0.32  | 3000 | 0.2626          | 95.7052 | 91.7321 | 95.6476 | 95.6512   | 33.58   |
| 0.2448        | 0.37  | 3500 | 0.2669          | 95.8003 | 91.7949 | 95.7536 | 95.7576   | 33.57   |
| 0.2345        | 0.43  | 4000 | 0.2582          | 95.8784 | 92.008  | 95.8284 | 95.8343   | 33.65   |
| 0.2345        | 0.48  | 4500 | 0.2629          | 95.8131 | 91.9088 | 95.7624 | 95.766    | 33.63   |
| 0.2284        | 0.53  | 5000 | 0.2585          | 95.8552 | 91.9833 | 95.8105 | 95.8135   | 33.62   |
| 0.2266        | 0.59  | 5500 | 0.2591          | 95.9205 | 92.0577 | 95.8689 | 95.8718   | 33.61   |
| 0.2281        | 0.64  | 6000 | 0.2605          | 95.9172 | 91.9782 | 95.874  | 95.8638   | 33.59   |
| 0.2228        | 0.69  | 6500 | 0.2566          | 95.7612 | 91.7858 | 95.7129 | 95.7058   | 33.63   |
| 0.2202        | 0.75  | 7000 | 0.2561          | 95.9468 | 92.0914 | 95.9018 | 95.8941   | 33.64   |
| 0.218         | 0.8   | 7500 | 0.2579          | 95.9468 | 92.0914 | 95.9018 | 95.8941   | 33.64   |
| 0.2162        | 0.85  | 8000 | 0.2523          | 95.8231 | 91.9464 | 95.7727 | 95.7758   | 33.66   |
| 0.2135        | 0.91  | 8500 | 0.2549          | 95.8388 | 91.9804 | 95.7914 | 95.7917   | 33.63   |
| 0.2124        | 0.96  | 9000 | 0.2537          | 95.8905 | 91.9178 | 95.8459 | 95.8393   | 33.61   |
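
The ROUGE columns are on a 0-100 scale. Below is a minimal sketch of how generated_from_trainer cards typically compute these scores with the `evaluate` library; the exact compute_metrics function used for this run is not documented.

```python
# Hedged sketch of the usual ROUGE computation; not necessarily this run's exact code.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["a corrected sentence"]  # placeholder: decoded model outputs
references = ["a corrected sentence"]   # placeholder: gold targets
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # scaled to 0-100 as in the table
```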

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.0
- Tokenizers 0.15.0
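
A quick way to check a local environment against these versions (a convenience snippet, not part of the original card):

```python
# Print installed versions to compare against the ones listed above.
import transformers, torch, datasets, tokenizers

print("Transformers:", transformers.__version__)  # card: 4.35.2
print("PyTorch:", torch.__version__)              # card: 2.1.0+cu121
print("Datasets:", datasets.__version__)          # card: 2.16.0
print("Tokenizers:", tokenizers.__version__)      # card: 0.15.0
```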