
t5-summarization-base-zero-shot-headers-and-better-prompt

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9932
  • Rouge: {'rouge1': 0.4431, 'rouge2': 0.2212, 'rougeL': 0.2154, 'rougeLsum': 0.2154}
  • Bert Score: 0.884
  • Bleurt 20: -0.6941
  • Gen Len: 14.385
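
A minimal usage sketch, assuming the model is loaded from this repository and used as a standard seq2seq summarizer through the Transformers API. The exact prompt template used during fine-tuning is not documented here, so the instruction prefix below is only an illustrative placeholder.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "veronica-girolimetti/t5-summarization-base-zero-shot-headers-and-better-prompt"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# NOTE: the prompt used during fine-tuning is not documented in this card;
# "summarize: " is an illustrative placeholder, not the actual template.
text = "summarize: " + "Your input document goes here."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```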

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
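
As a reading aid, the sketch below shows how these hyperparameters map onto `Seq2SeqTrainingArguments`; the `output_dir`, evaluation strategy, and generation flag are assumptions not stated in this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration implied by the hyperparameters above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-base-zero-shot-headers-and-better-prompt",  # assumption
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=20,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch rows in the results table
    predict_with_generate=True,   # assumption: needed to report ROUGE and Gen Len at eval time
)
```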

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bert Score | Bleurt 20 | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| 2.368 | 1.0 | 601 | 2.0543 | 0.4357 | 0.19 | 0.2033 | 0.2033 | 0.8765 | -0.7806 | 14.975 |
| 2.009 | 2.0 | 1202 | 1.9097 | 0.4239 | 0.1934 | 0.2218 | 0.2218 | 0.8815 | -0.7203 | 14.46 |
| 1.7996 | 3.0 | 1803 | 1.8539 | 0.4129 | 0.2041 | 0.2144 | 0.2144 | 0.8809 | -0.7453 | 13.68 |
| 1.5575 | 4.0 | 2404 | 1.8461 | 0.4259 | 0.2083 | 0.2101 | 0.2101 | 0.8839 | -0.7329 | 14.015 |
| 1.3425 | 5.0 | 3005 | 1.8792 | 0.4138 | 0.2006 | 0.2223 | 0.2223 | 0.8871 | -0.7175 | 13.655 |
| 1.1198 | 6.0 | 3606 | 1.9615 | 0.4304 | 0.2097 | 0.2128 | 0.2128 | 0.8826 | -0.7148 | 14.105 |
| 1.0207 | 7.0 | 4207 | 2.0121 | 0.4379 | 0.2189 | 0.2228 | 0.2228 | 0.8854 | -0.6751 | 14.11 |
| 0.9197 | 8.0 | 4808 | 2.1143 | 0.4394 | 0.2145 | 0.2139 | 0.2139 | 0.8852 | -0.6924 | 14.25 |
| 0.807 | 9.0 | 5409 | 2.1749 | 0.4572 | 0.2265 | 0.2199 | 0.2199 | 0.8839 | -0.6967 | 14.465 |
| 0.7784 | 10.0 | 6010 | 2.2013 | 0.4451 | 0.2195 | 0.2152 | 0.2152 | 0.8849 | -0.6766 | 14.485 |
| 0.6285 | 11.0 | 6611 | 2.3428 | 0.4367 | 0.2126 | 0.2157 | 0.2157 | 0.8846 | -0.7113 | 14.265 |
| 0.595 | 12.0 | 7212 | 2.5554 | 0.4373 | 0.2161 | 0.2202 | 0.2202 | 0.8844 | -0.6867 | 14.36 |
| 0.611 | 13.0 | 7813 | 2.4775 | 0.4416 | 0.2218 | 0.2151 | 0.2151 | 0.8833 | -0.7119 | 14.51 |
| 0.4811 | 14.0 | 8414 | 2.6892 | 0.4412 | 0.2242 | 0.2223 | 0.2223 | 0.8848 | -0.6574 | 14.65 |
| 0.4211 | 15.0 | 9015 | 2.7409 | 0.4471 | 0.2165 | 0.2141 | 0.2141 | 0.8843 | -0.6566 | 14.655 |
| 0.4611 | 16.0 | 9616 | 2.8461 | 0.4363 | 0.2117 | 0.2101 | 0.2101 | 0.8835 | -0.6921 | 14.37 |
| 0.402 | 17.0 | 10217 | 2.8848 | 0.4505 | 0.2204 | 0.2119 | 0.2119 | 0.8825 | -0.6888 | 14.615 |
| 0.4101 | 18.0 | 10818 | 2.9057 | 0.4453 | 0.2216 | 0.2103 | 0.2103 | 0.8824 | -0.6881 | 14.505 |
| 0.3483 | 19.0 | 11419 | 2.9700 | 0.4456 | 0.221 | 0.2139 | 0.2139 | 0.8836 | -0.6888 | 14.415 |
| 0.36 | 20.0 | 12020 | 2.9932 | 0.4431 | 0.2212 | 0.2154 | 0.2154 | 0.884 | -0.6941 | 14.385 |
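
The card does not state how the metrics above were computed. A plausible sketch using the Hugging Face `evaluate` library is shown below; the BERTScore language setting and the BLEURT-20 checkpoint are assumptions inferred from the metric names.

```python
import evaluate

# Assumed metric setup; the exact configuration behind this card is not documented.
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
bleurt = evaluate.load("bleurt", "BLEURT-20")  # needs the google-research BLEURT package installed

predictions = ["A short generated summary."]
references = ["A short reference summary."]

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_f1 = bertscore.compute(predictions=predictions, references=references, lang="en")["f1"]
bleurt_scores = bleurt.compute(predictions=predictions, references=references)["scores"]

print(rouge_scores)                              # rouge1 / rouge2 / rougeL / rougeLsum
print(sum(bert_f1) / len(bert_f1))               # mean BERTScore F1
print(sum(bleurt_scores) / len(bleurt_scores))   # mean BLEURT-20 score
```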

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0