metadata
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
datasets:
  - samsum
metrics:
  - rouge
model-index:
  - name: flan-t5-base-samsum
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: samsum
          type: samsum
          config: samsum
          split: test
          args: samsum
        metrics:
          - name: Rouge1
            type: rouge
            value: 47.0919

flan-t5-base-samsum

This model is a fine-tuned version of google/flan-t5-base on the samsum dataset. It achieves the following results on the evaluation set, which match the step 1000 row (epoch 1.63) of the training results:

  • Loss: 1.3859
  • Rouge1: 47.0919
  • Rouge2: 23.2123
  • Rougel: 39.2407
  • Rougelsum: 43.2174
  • Gen Len: 17.3333
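
To make the Rouge1 figure above easier to interpret, here is a minimal sketch of a unigram-overlap F1 score on the same 0 to 100 scale. It is a simplification and an assumption on our part: it tokenizes on whitespace and applies no stemming, so it will not reproduce the exact numbers of the `rouge` metric used for this card. The function name `rouge1_f1` and the two example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary (0-100)."""
    # Simplified tokenization: lowercase and split on whitespace (no stemming).
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical reference and generated summary, for illustration only.
reference = "amanda baked cookies and will bring jerry some tomorrow"
candidate = "amanda will bring jerry cookies tomorrow"
score = rouge1_f1(reference, candidate)
print(round(score, 2))
```

In this toy case every candidate word appears in the reference (precision 1.0) while the candidate covers six of the nine reference words (recall about 0.67), so the F1 sits between the two.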

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
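
As a rough illustration of what a linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. Two things here are assumptions rather than facts from this card: that no warmup was used, and that one epoch is about 614 steps (inferred from the training results, where step 1200 falls at epoch 1.95). The names `HPARAMS`, `ASSUMED_STEPS_PER_EPOCH`, and `linear_lr` are hypothetical.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 24,
    "eval_batch_size": 24,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_epochs": 2,
}

# Assumption: roughly 614 optimizer steps per epoch, inferred from the
# training results (step 1200 is logged at epoch 1.95).
ASSUMED_STEPS_PER_EPOCH = 614
TOTAL_STEPS = ASSUMED_STEPS_PER_EPOCH * HPARAMS["num_epochs"]


def linear_lr(step: int,
              base_lr: float = HPARAMS["learning_rate"],
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear decay from base_lr at step 0 to zero at total_steps (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at a few of the logged evaluation steps.
for step in (0, 600, 1200):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate has nearly reached zero by the last logged step, which is consistent with the validation loss flattening out late in the second epoch.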

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.5121 | 0.08 | 50 | 1.4287 | 46.7806 | 22.8207 | 38.9302 | 42.7835 | 16.9634 |
| 1.46 | 0.16 | 100 | 1.4199 | 46.826 | 22.7844 | 39.0295 | 42.8573 | 17.2393 |
| 1.4515 | 0.24 | 150 | 1.4147 | 46.6646 | 22.9602 | 38.9391 | 42.8187 | 17.1245 |
| 1.4679 | 0.33 | 200 | 1.4121 | 46.8291 | 22.7922 | 39.1404 | 43.1542 | 17.3431 |
| 1.451 | 0.41 | 250 | 1.4109 | 46.8103 | 23.0066 | 39.2832 | 43.2411 | 17.2686 |
| 1.4434 | 0.49 | 300 | 1.4040 | 46.6321 | 22.989 | 39.3016 | 43.0997 | 16.9158 |
| 1.4417 | 0.57 | 350 | 1.4007 | 46.8538 | 22.9937 | 39.2135 | 43.1728 | 17.1172 |
| 1.4781 | 0.65 | 400 | 1.3952 | 46.8055 | 23.036 | 39.2961 | 43.1755 | 17.2076 |
| 1.4626 | 0.73 | 450 | 1.3940 | 47.0996 | 23.2205 | 39.3007 | 43.2286 | 17.2222 |
| 1.4307 | 0.81 | 500 | 1.3955 | 46.8877 | 23.1402 | 39.2634 | 43.1279 | 17.2002 |
| 1.4586 | 0.9 | 550 | 1.3933 | 46.7191 | 23.1291 | 39.2437 | 43.1183 | 17.3040 |
| 1.4465 | 0.98 | 600 | 1.3905 | 46.8651 | 23.29 | 39.2514 | 43.2025 | 17.3468 |
| 1.381 | 1.06 | 650 | 1.3953 | 46.9166 | 22.9547 | 39.0439 | 43.1589 | 17.4066 |
| 1.4125 | 1.14 | 700 | 1.3922 | 46.5286 | 23.0552 | 38.9056 | 42.9298 | 17.2381 |
| 1.3667 | 1.22 | 750 | 1.3922 | 47.3239 | 23.3549 | 39.4725 | 43.518 | 17.2930 |
| 1.3878 | 1.3 | 800 | 1.3953 | 46.6837 | 23.1602 | 39.2578 | 43.2195 | 17.3358 |
| 1.3884 | 1.38 | 850 | 1.3931 | 46.9537 | 23.0894 | 39.1676 | 43.1687 | 17.3614 |
| 1.3766 | 1.47 | 900 | 1.3898 | 46.9996 | 23.1407 | 39.2222 | 43.237 | 17.3333 |
| 1.3727 | 1.55 | 950 | 1.3889 | 46.6936 | 23.0454 | 39.0579 | 42.9472 | 17.3211 |
| 1.4001 | 1.63 | 1000 | 1.3859 | 47.0919 | 23.2123 | 39.2407 | 43.2174 | 17.3333 |
| 1.3894 | 1.71 | 1050 | 1.3874 | 47.2229 | 23.35 | 39.4333 | 43.4876 | 17.3297 |
| 1.3697 | 1.79 | 1100 | 1.3860 | 47.0872 | 23.3503 | 39.3371 | 43.3444 | 17.3504 |
| 1.3886 | 1.87 | 1150 | 1.3862 | 47.0516 | 23.3487 | 39.3653 | 43.3272 | 17.3260 |
| 1.4037 | 1.95 | 1200 | 1.3861 | 47.05 | 23.3672 | 39.3131 | 43.3233 | 17.3321 |
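
The results reported at the top of this card coincide with one row of the log above. The sketch below, using only the Step, Validation Loss, and Rouge1 columns, shows which evaluation step has the lowest validation loss and which has the highest Rouge1. The card does not state how the reported checkpoint was selected, so reading it as "lowest validation loss" is an assumption; the variable names are hypothetical.

```python
# (step, validation_loss, rouge1) taken from the training results log.
ROWS = [
    (50, 1.4287, 46.7806), (100, 1.4199, 46.826), (150, 1.4147, 46.6646),
    (200, 1.4121, 46.8291), (250, 1.4109, 46.8103), (300, 1.4040, 46.6321),
    (350, 1.4007, 46.8538), (400, 1.3952, 46.8055), (450, 1.3940, 47.0996),
    (500, 1.3955, 46.8877), (550, 1.3933, 46.7191), (600, 1.3905, 46.8651),
    (650, 1.3953, 46.9166), (700, 1.3922, 46.5286), (750, 1.3922, 47.3239),
    (800, 1.3953, 46.6837), (850, 1.3931, 46.9537), (900, 1.3898, 46.9996),
    (950, 1.3889, 46.6936), (1000, 1.3859, 47.0919), (1050, 1.3874, 47.2229),
    (1100, 1.3860, 47.0872), (1150, 1.3862, 47.0516), (1200, 1.3861, 47.05),
]

# Evaluation step with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[1])

# Evaluation step with the highest Rouge1.
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

The lowest validation loss falls at step 1000, the row whose metrics equal the reported results, while the highest Rouge1 is logged earlier, at step 750. The two criteria therefore point to different checkpoints, and the differences between the later rows are small.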

Framework versions

  • Transformers 4.33.2
  • PyTorch 2.0.0+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3