---
license: apache-2.0
base_model: google/flan-t5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: t5-summarization-base-zero-shot-headers-and-better-prompt
  results: []
---

# t5-summarization-base-zero-shot-headers-and-better-prompt

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9932
- Rouge1: 0.4431
- Rouge2: 0.2212
- RougeL: 0.2154
- RougeLsum: 0.2154
- Bert Score: 0.884
- Bleurt 20: -0.6941
- Gen Len: 14.385

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bert Score | Bleurt 20 | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:----------:|:---------:|:-------:|
| 2.368         | 1.0   | 601   | 2.0543          | 0.4357 | 0.19   | 0.2033 | 0.2033    | 0.8765     | -0.7806   | 14.975  |
| 2.009         | 2.0   | 1202  | 1.9097          | 0.4239 | 0.1934 | 0.2218 | 0.2218    | 0.8815     | -0.7203   | 14.46   |
| 1.7996        | 3.0   | 1803  | 1.8539          | 0.4129 | 0.2041 | 0.2144 | 0.2144    | 0.8809     | -0.7453   | 13.68   |
| 1.5575        | 4.0   | 2404  | 1.8461          | 0.4259 | 0.2083 | 0.2101 | 0.2101    | 0.8839     | -0.7329   | 14.015  |
| 1.3425        | 5.0   | 3005  | 1.8792          | 0.4138 | 0.2006 | 0.2223 | 0.2223    | 0.8871     | -0.7175   | 13.655  |
| 1.1198        | 6.0   | 3606  | 1.9615          | 0.4304 | 0.2097 | 0.2128 | 0.2128    | 0.8826     | -0.7148   | 14.105  |
| 1.0207        | 7.0   | 4207  | 2.0121          | 0.4379 | 0.2189 | 0.2228 | 0.2228    | 0.8854     | -0.6751   | 14.11   |
| 0.9197        | 8.0   | 4808  | 2.1143          | 0.4394 | 0.2145 | 0.2139 | 0.2139    | 0.8852     | -0.6924   | 14.25   |
| 0.807         | 9.0   | 5409  | 2.1749          | 0.4572 | 0.2265 | 0.2199 | 0.2199    | 0.8839     | -0.6967   | 14.465  |
| 0.7784        | 10.0  | 6010  | 2.2013          | 0.4451 | 0.2195 | 0.2152 | 0.2152    | 0.8849     | -0.6766   | 14.485  |
| 0.6285        | 11.0  | 6611  | 2.3428          | 0.4367 | 0.2126 | 0.2157 | 0.2157    | 0.8846     | -0.7113   | 14.265  |
| 0.595         | 12.0  | 7212  | 2.5554          | 0.4373 | 0.2161 | 0.2202 | 0.2202    | 0.8844     | -0.6867   | 14.36   |
| 0.611         | 13.0  | 7813  | 2.4775          | 0.4416 | 0.2218 | 0.2151 | 0.2151    | 0.8833     | -0.7119   | 14.51   |
| 0.4811        | 14.0  | 8414  | 2.6892          | 0.4412 | 0.2242 | 0.2223 | 0.2223    | 0.8848     | -0.6574   | 14.65   |
| 0.4211        | 15.0  | 9015  | 2.7409          | 0.4471 | 0.2165 | 0.2141 | 0.2141    | 0.8843     | -0.6566   | 14.655  |
| 0.4611        | 16.0  | 9616  | 2.8461          | 0.4363 | 0.2117 | 0.2101 | 0.2101    | 0.8835     | -0.6921   | 14.37   |
| 0.402         | 17.0  | 10217 | 2.8848          | 0.4505 | 0.2204 | 0.2119 | 0.2119    | 0.8825     | -0.6888   | 14.615  |
| 0.4101        | 18.0  | 10818 | 2.9057          | 0.4453 | 0.2216 | 0.2103 | 0.2103    | 0.8824     | -0.6881   | 14.505  |
| 0.3483        | 19.0  | 11419 | 2.9700          | 0.4456 | 0.221  | 0.2139 | 0.2139    | 0.8836     | -0.6888   | 14.415  |
| 0.36          | 20.0  | 12020 | 2.9932          | 0.4431 | 0.2212 | 0.2154 | 0.2154    | 0.884      | -0.6941   | 14.385  |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
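## How to use

A minimal inference sketch with the `transformers` library. The checkpoint identifier below is a placeholder (the card does not state where the fine-tuned weights are published, so `google/flan-t5-base` is used as a stand-in), and the `summarize:` prefix is an assumption based on the usual T5 summarization prompt format, not something this card documents.

```python
# Minimal summarization sketch. Assumptions: the checkpoint path is a
# placeholder for the fine-tuned weights, and the "summarize:" prefix
# follows the conventional T5 prompt format (not documented in this card).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "google/flan-t5-base"  # replace with the fine-tuned checkpoint path
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

text = (
    "summarize: The city council met on Tuesday to discuss the new "
    "transit plan, which would add three bus routes and extend "
    "service hours on weekends."
)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# max_new_tokens ~ the average generation length reported above (Gen Len ≈ 14)
output_ids = model.generate(**inputs, max_new_tokens=32)
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(summary)
```

Since the evaluation reports an average generation length around 14 tokens, a small `max_new_tokens` budget is usually sufficient; adjust it for longer inputs.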