metadata
license: apache-2.0
base_model: google/flan-t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-s
    results: []

flan-t5-s

This model is a fine-tuned version of google/flan-t5-small. The training dataset was not recorded (the auto-generated card lists it as "None"). After the final epoch, the model achieves the following results on the evaluation set:

  • Loss: 0.2736
  • Rouge1: 40.152
  • Rouge2: 15.8816
  • Rougel: 33.4399
  • Rougelsum: 35.9029
  • Gen Len: 19.886
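The following is a minimal inference sketch, not a documented usage recipe. It assumes the checkpoint is available on the Hugging Face Hub under the hypothetical repository id `dtruong46me/flan-t5-s` (inferred from the uploader and model names, not stated in this card) and that the task is summarization-like, as the ROUGE metrics suggest. The prompt format used during fine-tuning is not documented, so the input text is passed through unchanged. The `max_new_tokens` and `max_length` values are illustrative.

```python
# Hypothetical repo id: inferred from the uploader and model names, not confirmed by the card.
MODEL_ID = "dtruong46me/flan-t5-s"


def summarize(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Generate one output sequence for `text` with the fine-tuned checkpoint."""
    # Imported lazily so this function can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # The 512-token truncation limit is an assumption; the card does not
    # state the input length used during training.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(some_text)` downloads the weights on first use. The reported Gen Len of about 19.9 is consistent with evaluation outputs capped at 20 tokens, which is the Transformers default `max_length` for generation, so the evaluation outputs may have been truncated; a larger `max_new_tokens` may give longer results at inference time.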

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
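The hyperparameters above can be related to each other with a little arithmetic. The sketch below assumes a single training device and no learning-rate warmup, neither of which is stated in the card, and takes the total step count (13842) from the last logged step in the training results. The dataset-size figure is only a rough estimate derived from those numbers.

```python
train_batch_size = 8
gradient_accumulation_steps = 2
num_devices = 1  # assumption: the card does not list a device count
num_epochs = 6
learning_rate = 1e-5
total_steps = 13842  # last logged optimizer step in the training results
warmup_steps = 0  # assumption: no warmup is listed in the card

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Rough estimate of the training set size from optimizer steps per epoch.
steps_per_epoch = total_steps // num_epochs
approx_train_examples = steps_per_epoch * total_train_batch_size


def linear_lr(step: int) -> float:
    """Learning rate at `step` under a linear schedule with optional warmup."""
    if step < warmup_steps:
        return learning_rate * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return learning_rate * (remaining / max(1, total_steps - warmup_steps))


print(total_train_batch_size)  # 16, matching the value listed above
print(approx_train_examples)   # 36912, a rough estimate only
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 1e-05, reaches half that value at the midpoint of training, and decays to zero at the final step.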

Training results

Training Loss  Epoch  Step   Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
0.347          1.0    2307   0.2918           38.3203  14.7065  31.7739  34.536     19.904
0.2765         2.0    4615   0.2817           38.9417  15.3147  32.5082  35.1789    19.884
0.2683         3.0    6922   0.2776           39.3458  15.3133  32.7661  35.2993    19.878
0.2635         4.0    9230   0.2751           39.7671  15.7051  33.1173  35.6438    19.884
0.2611         5.0    11537  0.2738           39.8607  15.5855  33.1643  35.6319    19.882
0.2592         6.0    13842  0.2736           40.152   15.8816  33.4399  35.9029    19.886
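For readers unfamiliar with the ROUGE columns, the sketch below illustrates the idea behind Rouge1 and Rouge2 as an n-gram overlap F1 score. It is a deliberate simplification (lowercased whitespace tokens, no stemming) and is not the implementation that produced the numbers above; the helper name `rouge_n_f1` is hypothetical. The reported scores appear to be on a 0 to 100 scale, whereas this function returns a value between 0 and 1.

```python
from collections import Counter


def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap between two texts."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred_counts, ref_counts = ngrams(prediction), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# A short prediction fully contained in a longer reference:
# precision is 1, recall is 2/6, so F1 is 0.5.
print(rouge_n_f1("the cat", "the cat sat on the mat"))
```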

Framework versions

  • Transformers 4.36.1
  • Pytorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.15.2
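To compare a local environment against the versions listed above, something like the following sketch may help. It assumes the usual PyPI distribution names (`torch` for Pytorch); the helper name `version_report` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
PINNED = {
    "transformers": "4.36.1",
    "torch": "2.1.2",
    "datasets": "2.20.0",
    "tokenizers": "0.15.2",
}


def version_report(pinned: dict = PINNED) -> dict:
    """Map each package to (expected version, installed version or None)."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed)
    return report


for package, (expected, installed) in version_report().items():
    print(f"{package}: expected {expected}, installed {installed}")
```

Installed Pytorch builds often carry a local suffix such as `+cu121`, so an exact string comparison against `2.1.2` may be too strict.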