--- base_model: google/flan-t5-large license: apache-2.0 metrics: - rouge tags: - generated_from_trainer model-index: - name: flan-t5-large-summary results: [] --- # flan-t5-large-summary This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on the None dataset. It achieves the following results on the evaluation set: - Loss: 0.3277 - Rouge1: 69.554 - Rouge2: 59.2044 - Rougel: 67.6581 - Rougelsum: 67.7174 - Gen Len: 18.7309 ## Model description More information needed ## Intended uses & limitations More information needed ## Training and evaluation data More information needed ## Training procedure ### Training hyperparameters The following hyperparameters were used during training: - learning_rate: 5e-05 - train_batch_size: 8 - eval_batch_size: 8 - seed: 42 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 - lr_scheduler_type: linear - num_epochs: 20 ### Training results | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:| | No log | 1.0 | 190 | 0.3832 | 67.2926 | 53.7031 | 64.7974 | 64.8071 | 18.8179 | | No log | 2.0 | 380 | 0.3090 | 67.8051 | 55.3515 | 65.6548 | 65.6747 | 18.8232 | | 0.2632 | 3.0 | 570 | 0.2876 | 68.3985 | 56.4896 | 66.3492 | 66.3648 | 18.7889 | | 0.2632 | 4.0 | 760 | 0.2789 | 68.5676 | 56.4035 | 66.101 | 66.1396 | 18.8100 | | 0.2632 | 5.0 | 950 | 0.2805 | 68.2045 | 56.6864 | 66.2411 | 66.3045 | 18.7784 | | 0.2625 | 6.0 | 1140 | 0.2827 | 68.4741 | 57.1103 | 66.101 | 66.1991 | 18.7810 | | 0.2625 | 7.0 | 1330 | 0.2767 | 68.9503 | 57.7255 | 66.8972 | 66.9236 | 18.7916 | | 0.2048 | 8.0 | 1520 | 0.2776 | 69.1733 | 57.5766 | 66.7948 | 66.9347 | 18.7968 | | 0.2048 | 9.0 | 1710 | 0.2777 | 69.7098 | 58.867 | 67.8471 | 67.8675 | 18.7388 | | 0.2048 | 10.0 | 1900 | 0.2855 | 69.2753 | 57.8341 | 67.1037 | 67.2396 | 18.7757 | | 0.1653 | 11.0 | 2090 | 0.2819 | 69.7836 | 58.5212 | 67.6087 | 67.6426 | 18.7916 | | 0.1653 | 12.0 | 2280 | 0.2918 | 70.0288 | 59.2524 | 68.0356 | 68.0754 | 18.7573 | | 0.1653 | 13.0 | 2470 | 0.2940 | 69.7288 | 59.2246 | 67.8426 | 67.8537 | 18.7704 | | 0.1421 | 14.0 | 2660 | 0.2995 | 69.6257 | 58.8437 | 67.6509 | 67.6521 | 18.7810 | | 0.1421 | 15.0 | 2850 | 0.3042 | 69.5083 | 58.6743 | 67.4254 | 67.5589 | 18.7625 | | 0.1201 | 16.0 | 3040 | 0.3114 | 69.3398 | 58.9756 | 67.4566 | 67.5762 | 18.7810 | | 0.1201 | 17.0 | 3230 | 0.3158 | 69.3569 | 58.8509 | 67.3013 | 67.4321 | 18.7520 | | 0.1201 | 18.0 | 3420 | 0.3215 | 69.7484 | 59.1145 | 67.6617 | 67.6992 | 18.7573 | | 0.1076 | 19.0 | 3610 | 0.3254 | 69.5556 | 59.2392 | 67.7187 | 67.7966 | 18.7520 | | 0.1076 | 20.0 | 3800 | 0.3277 | 69.554 | 59.2044 | 67.6581 | 67.7174 | 18.7309 | ### Framework versions - Transformers 4.43.2 - Pytorch 2.2.0a0+81ea7a4 - Datasets 2.20.0 - Tokenizers 0.19.1