
t5-summarization-base-zero-shot-headers-and-better-prompt

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9932
  • Rouge: {'rouge1': 0.4431, 'rouge2': 0.2212, 'rougeL': 0.2154, 'rougeLsum': 0.2154}
  • Bert Score: 0.884
  • Bleurt 20: -0.6941
  • Gen Len: 14.385
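
A minimal usage sketch, assuming the model is loaded from this repository and used as a standard seq2seq summarizer through the Transformers API. The exact prompt template used during fine-tuning is not documented here, so the instruction prefix below is only an illustrative placeholder.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "veronica-girolimetti/t5-summarization-base-zero-shot-headers-and-better-prompt"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# NOTE: the prompt used during fine-tuning is not documented in this card;
# "summarize: " is an illustrative placeholder, not the actual template.
text = "summarize: " + "Your input document goes here."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```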

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
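
As a reading aid, the sketch below shows how these hyperparameters map onto `Seq2SeqTrainingArguments`; the `output_dir`, evaluation strategy, and generation flag are assumptions not stated in this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration implied by the hyperparameters above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-base-zero-shot-headers-and-better-prompt",  # assumption
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=20,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch rows in the results table
    predict_with_generate=True,   # assumption: needed to report ROUGE and Gen Len at eval time
)
```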

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bert Score | Bleurt 20 | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| 2.368 | 1.0 | 601 | 2.0543 | 0.4357 | 0.19 | 0.2033 | 0.2033 | 0.8765 | -0.7806 | 14.975 |
| 2.009 | 2.0 | 1202 | 1.9097 | 0.4239 | 0.1934 | 0.2218 | 0.2218 | 0.8815 | -0.7203 | 14.46 |
| 1.7996 | 3.0 | 1803 | 1.8539 | 0.4129 | 0.2041 | 0.2144 | 0.2144 | 0.8809 | -0.7453 | 13.68 |
| 1.5575 | 4.0 | 2404 | 1.8461 | 0.4259 | 0.2083 | 0.2101 | 0.2101 | 0.8839 | -0.7329 | 14.015 |
| 1.3425 | 5.0 | 3005 | 1.8792 | 0.4138 | 0.2006 | 0.2223 | 0.2223 | 0.8871 | -0.7175 | 13.655 |
| 1.1198 | 6.0 | 3606 | 1.9615 | 0.4304 | 0.2097 | 0.2128 | 0.2128 | 0.8826 | -0.7148 | 14.105 |
| 1.0207 | 7.0 | 4207 | 2.0121 | 0.4379 | 0.2189 | 0.2228 | 0.2228 | 0.8854 | -0.6751 | 14.11 |
| 0.9197 | 8.0 | 4808 | 2.1143 | 0.4394 | 0.2145 | 0.2139 | 0.2139 | 0.8852 | -0.6924 | 14.25 |
| 0.807 | 9.0 | 5409 | 2.1749 | 0.4572 | 0.2265 | 0.2199 | 0.2199 | 0.8839 | -0.6967 | 14.465 |
| 0.7784 | 10.0 | 6010 | 2.2013 | 0.4451 | 0.2195 | 0.2152 | 0.2152 | 0.8849 | -0.6766 | 14.485 |
| 0.6285 | 11.0 | 6611 | 2.3428 | 0.4367 | 0.2126 | 0.2157 | 0.2157 | 0.8846 | -0.7113 | 14.265 |
| 0.595 | 12.0 | 7212 | 2.5554 | 0.4373 | 0.2161 | 0.2202 | 0.2202 | 0.8844 | -0.6867 | 14.36 |
| 0.611 | 13.0 | 7813 | 2.4775 | 0.4416 | 0.2218 | 0.2151 | 0.2151 | 0.8833 | -0.7119 | 14.51 |
| 0.4811 | 14.0 | 8414 | 2.6892 | 0.4412 | 0.2242 | 0.2223 | 0.2223 | 0.8848 | -0.6574 | 14.65 |
| 0.4211 | 15.0 | 9015 | 2.7409 | 0.4471 | 0.2165 | 0.2141 | 0.2141 | 0.8843 | -0.6566 | 14.655 |
| 0.4611 | 16.0 | 9616 | 2.8461 | 0.4363 | 0.2117 | 0.2101 | 0.2101 | 0.8835 | -0.6921 | 14.37 |
| 0.402 | 17.0 | 10217 | 2.8848 | 0.4505 | 0.2204 | 0.2119 | 0.2119 | 0.8825 | -0.6888 | 14.615 |
| 0.4101 | 18.0 | 10818 | 2.9057 | 0.4453 | 0.2216 | 0.2103 | 0.2103 | 0.8824 | -0.6881 | 14.505 |
| 0.3483 | 19.0 | 11419 | 2.9700 | 0.4456 | 0.221 | 0.2139 | 0.2139 | 0.8836 | -0.6888 | 14.415 |
| 0.36 | 20.0 | 12020 | 2.9932 | 0.4431 | 0.2212 | 0.2154 | 0.2154 | 0.884 | -0.6941 | 14.385 |
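
The card does not state how the metrics above were computed. A plausible sketch using the Hugging Face `evaluate` library is shown below; the BERTScore language setting and the BLEURT-20 checkpoint are assumptions inferred from the metric names.

```python
import evaluate

# Assumed metric setup; the exact configuration behind this card is not documented.
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
bleurt = evaluate.load("bleurt", "BLEURT-20")  # needs the google-research BLEURT package installed

predictions = ["A short generated summary."]
references = ["A short reference summary."]

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_f1 = bertscore.compute(predictions=predictions, references=references, lang="en")["f1"]
bleurt_scores = bleurt.compute(predictions=predictions, references=references)["scores"]

print(rouge_scores)                              # rouge1 / rouge2 / rougeL / rougeLsum
print(sum(bert_f1) / len(bert_f1))               # mean BERTScore F1
print(sum(bleurt_scores) / len(bleurt_scores))   # mean BLEURT-20 score
```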

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0