# flan-t5-rouge-durga-q5-clean-4c
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.2052
- ROUGE-1: 0.4357
- ROUGE-2: 0.2982
- ROUGE-L: 0.4326
- ROUGE-Lsum: 0.4327
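The card ships without a usage snippet, so below is a minimal inference sketch with `transformers`. The repository id `devagonal/flan-t5-rouge-durga-q5-clean-4c` is the one shown on this card; the prompt is purely illustrative, since the training data and task are not documented.

```python
# Minimal inference sketch, assuming the checkpoint is hosted as
# devagonal/flan-t5-rouge-durga-q5-clean-4c (the id shown on this card).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "devagonal/flan-t5-rouge-durga-q5-clean-4c"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical prompt: the actual task/format of the fine-tuning data is undocumented.
inputs = tokenizer("Who is Durga?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```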
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
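As a sketch only, these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.46 roughly as follows; the output directory, evaluation strategy, and `predict_with_generate` flag are assumptions, since the card does not state them.

```python
# Hedged reconstruction of the training configuration above; only the listed
# hyperparameters are taken from the card, everything else is an assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4c",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    eval_strategy="epoch",        # assumed: the results table reports one eval per epoch
    predict_with_generate=True,   # assumed: required to score generations with ROUGE
)
```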
### Training results
| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|---|---|---|---|---|---|---|---|
| 2.438  | 1.0  | 9   | 1.9807 | 0.2583 | 0.0732 | 0.2536 | 0.2534 |
| 2.6463 | 2.0  | 18  | 1.7018 | 0.2588 | 0.0705 | 0.2521 | 0.2525 |
| 1.7877 | 3.0  | 27  | 1.4796 | 0.2840 | 0.0824 | 0.2764 | 0.2765 |
| 2.0417 | 4.0  | 36  | 1.3040 | 0.3100 | 0.1072 | 0.3028 | 0.3027 |
| 2.0216 | 5.0  | 45  | 1.1630 | 0.3322 | 0.1262 | 0.3255 | 0.3246 |
| 1.7093 | 6.0  | 54  | 1.0289 | 0.3359 | 0.1283 | 0.3283 | 0.3285 |
| 1.6109 | 7.0  | 63  | 0.9288 | 0.3728 | 0.1752 | 0.3631 | 0.3628 |
| 1.3041 | 8.0  | 72  | 0.8358 | 0.3691 | 0.1709 | 0.3592 | 0.3593 |
| 1.3242 | 9.0  | 81  | 0.7609 | 0.3666 | 0.1744 | 0.3573 | 0.3579 |
| 1.0971 | 10.0 | 90  | 0.6803 | 0.3724 | 0.1809 | 0.3659 | 0.3663 |
| 0.7156 | 11.0 | 99  | 0.6153 | 0.3742 | 0.1833 | 0.3634 | 0.3637 |
| 0.8419 | 12.0 | 108 | 0.5537 | 0.3748 | 0.1870 | 0.3645 | 0.3655 |
| 0.8853 | 13.0 | 117 | 0.5012 | 0.3775 | 0.1986 | 0.3681 | 0.3687 |
| 1.0922 | 14.0 | 126 | 0.4396 | 0.3738 | 0.1960 | 0.3629 | 0.3634 |
| 0.8752 | 15.0 | 135 | 0.4022 | 0.3844 | 0.2097 | 0.3755 | 0.3762 |
| 0.8189 | 16.0 | 144 | 0.3810 | 0.4050 | 0.2350 | 0.3970 | 0.3975 |
| 0.639  | 17.0 | 153 | 0.3503 | 0.4039 | 0.2341 | 0.3977 | 0.3976 |
| 0.7971 | 18.0 | 162 | 0.3162 | 0.4082 | 0.2428 | 0.4022 | 0.4028 |
| 0.7211 | 19.0 | 171 | 0.3069 | 0.4174 | 0.2504 | 0.4131 | 0.4128 |
| 0.7633 | 20.0 | 180 | 0.2804 | 0.4204 | 0.2562 | 0.4154 | 0.4167 |
| 0.6475 | 21.0 | 189 | 0.2685 | 0.4308 | 0.2750 | 0.4269 | 0.4274 |
| 0.5642 | 22.0 | 198 | 0.2498 | 0.4232 | 0.2700 | 0.4175 | 0.4184 |
| 0.66   | 23.0 | 207 | 0.2377 | 0.4311 | 0.2832 | 0.4246 | 0.4249 |
| 0.6004 | 24.0 | 216 | 0.2335 | 0.4298 | 0.2868 | 0.4255 | 0.4257 |
| 0.6263 | 25.0 | 225 | 0.2216 | 0.4252 | 0.2806 | 0.4211 | 0.4212 |
| 0.4931 | 26.0 | 234 | 0.2146 | 0.4274 | 0.2858 | 0.4232 | 0.4236 |
| 0.5072 | 27.0 | 243 | 0.2091 | 0.4309 | 0.2862 | 0.4266 | 0.4267 |
| 0.5079 | 28.0 | 252 | 0.2069 | 0.4354 | 0.2969 | 0.4315 | 0.4324 |
| 0.494  | 29.0 | 261 | 0.2058 | 0.4326 | 0.2965 | 0.4290 | 0.4299 |
| 0.6008 | 30.0 | 270 | 0.2052 | 0.4357 | 0.2982 | 0.4326 | 0.4327 |
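The ROUGE columns above are the kind of scores produced by the `evaluate` library; a small sketch of that computation with invented prediction/reference strings follows (the exact `compute_metrics` used for this run is not documented).

```python
# Sketch of the metric computation; the strings are invented placeholders.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the goddess durga rides a lion"],  # hypothetical model output
    references=["goddess durga rides a lion"],       # hypothetical reference text
)
print(scores)  # dict with keys rouge1, rouge2, rougeL, rougeLsum
```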
### Framework versions

- Transformers 4.46.0
- PyTorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1