my_awesome_billsum_model_68

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.2186
  • Rouge1: 0.9718
  • Rouge2: 0.8861
  • Rougel: 0.9312
  • Rougelsum: 0.9298
  • Gen Len: 5.0625
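
The snippet below is a minimal inference sketch, not an official usage example. The Hub repository id ("your-username/my_awesome_billsum_model_68") and the "summarize: " T5 task prefix are assumptions; the card states neither.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id -- replace with the actual namespace/name of this model.
model_id = "your-username/my_awesome_billsum_model_68"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# "summarize: " is the usual T5 summarization prefix; whether it was used
# during fine-tuning is an assumption, since the card does not say.
text = "summarize: " + "Your input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```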

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding Seq2SeqTrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
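
A minimal sketch of these settings as Seq2SeqTrainingArguments, assuming the standard Trainer API was used; the output directory, evaluation cadence, and predict_with_generate flag are illustrative assumptions not recorded in the card.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters listed above. Values not in the list
# (output_dir, evaluation strategy, generation during eval) are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model_68",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    fp16=True,                     # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",   # assumed: the results table shows one eval per epoch
    predict_with_generate=True,    # assumed: needed to log ROUGE / Gen Len
)
# The Adam betas=(0.9, 0.999) and epsilon=1e-08 listed above match the
# optimizer defaults, so no explicit optimizer argument is needed here.
```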

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 12 | 2.0043 | 0.3937 | 0.2702 | 0.3788 | 0.3776 | 17.75 |
| No log | 2.0 | 24 | 1.4138 | 0.4258 | 0.2978 | 0.4009 | 0.3998 | 16.8333 |
| No log | 3.0 | 36 | 0.8103 | 0.5858 | 0.4637 | 0.5658 | 0.5614 | 12.7083 |
| No log | 4.0 | 48 | 0.5156 | 0.9539 | 0.8354 | 0.8948 | 0.8934 | 4.8542 |
| No log | 5.0 | 60 | 0.4552 | 0.9539 | 0.8354 | 0.8948 | 0.8934 | 4.8542 |
| No log | 6.0 | 72 | 0.4053 | 0.965 | 0.8514 | 0.9092 | 0.9055 | 4.8958 |
| No log | 7.0 | 84 | 0.3565 | 0.968 | 0.8653 | 0.9144 | 0.9144 | 4.9167 |
| No log | 8.0 | 96 | 0.3263 | 0.968 | 0.8653 | 0.9144 | 0.9144 | 4.9167 |
| No log | 9.0 | 108 | 0.2998 | 0.968 | 0.8653 | 0.9144 | 0.9144 | 4.9167 |
| No log | 10.0 | 120 | 0.2807 | 0.967 | 0.8788 | 0.9273 | 0.9253 | 4.8958 |
| No log | 11.0 | 132 | 0.2694 | 0.967 | 0.8788 | 0.9273 | 0.9253 | 4.8958 |
| No log | 12.0 | 144 | 0.2622 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 13.0 | 156 | 0.2490 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 14.0 | 168 | 0.2427 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 15.0 | 180 | 0.2385 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 16.0 | 192 | 0.2350 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 17.0 | 204 | 0.2284 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 18.0 | 216 | 0.2212 | 0.967 | 0.8795 | 0.9273 | 0.9253 | 4.9375 |
| No log | 19.0 | 228 | 0.2173 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 20.0 | 240 | 0.2177 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 21.0 | 252 | 0.2161 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 22.0 | 264 | 0.2111 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 23.0 | 276 | 0.2072 | 0.967 | 0.8799 | 0.9273 | 0.9271 | 4.9792 |
| No log | 24.0 | 288 | 0.2066 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 25.0 | 300 | 0.2068 | 0.973 | 0.9146 | 0.9464 | 0.9435 | 4.9792 |
| No log | 26.0 | 312 | 0.2080 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 27.0 | 324 | 0.2078 | 0.97 | 0.892 | 0.936 | 0.9343 | 4.9583 |
| No log | 28.0 | 336 | 0.1976 | 0.973 | 0.8993 | 0.9346 | 0.9328 | 4.9792 |
| No log | 29.0 | 348 | 0.1921 | 0.973 | 0.8993 | 0.9346 | 0.9328 | 4.9792 |
| No log | 30.0 | 360 | 0.1896 | 0.973 | 0.8993 | 0.9346 | 0.9328 | 4.9792 |
| No log | 31.0 | 372 | 0.1906 | 0.9686 | 0.8792 | 0.9223 | 0.9204 | 5.0 |
| No log | 32.0 | 384 | 0.1942 | 0.973 | 0.8993 | 0.9346 | 0.9328 | 4.9792 |
| No log | 33.0 | 396 | 0.1976 | 0.97 | 0.8868 | 0.926 | 0.9253 | 5.0 |
| No log | 34.0 | 408 | 0.2006 | 0.97 | 0.9021 | 0.9363 | 0.9353 | 5.0 |
| No log | 35.0 | 420 | 0.1983 | 0.97 | 0.9021 | 0.9363 | 0.9353 | 5.0 |
| No log | 36.0 | 432 | 0.2010 | 0.967 | 0.8799 | 0.9273 | 0.9271 | 4.9792 |
| No log | 37.0 | 444 | 0.2014 | 0.97 | 0.9021 | 0.9363 | 0.9353 | 5.0 |
| No log | 38.0 | 456 | 0.2027 | 0.97 | 0.9021 | 0.9363 | 0.9353 | 5.0 |
| No log | 39.0 | 468 | 0.2059 | 0.97 | 0.9021 | 0.9363 | 0.9353 | 5.0 |
| No log | 40.0 | 480 | 0.2035 | 0.97 | 0.9021 | 0.9363 | 0.9353 | 5.0 |
| No log | 41.0 | 492 | 0.1989 | 0.97 | 0.8937 | 0.9363 | 0.9353 | 5.0 |
| 0.4765 | 42.0 | 504 | 0.1969 | 0.973 | 0.892 | 0.9346 | 0.933 | 5.0208 |
| 0.4765 | 43.0 | 516 | 0.1958 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 44.0 | 528 | 0.1937 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 45.0 | 540 | 0.1922 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 46.0 | 552 | 0.1940 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 47.0 | 564 | 0.1944 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 48.0 | 576 | 0.1943 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 49.0 | 588 | 0.1985 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 50.0 | 600 | 0.2034 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 51.0 | 612 | 0.2071 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 52.0 | 624 | 0.2113 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 53.0 | 636 | 0.2115 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 54.0 | 648 | 0.2104 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 55.0 | 660 | 0.2109 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 56.0 | 672 | 0.2114 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 57.0 | 684 | 0.2127 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 58.0 | 696 | 0.2149 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 59.0 | 708 | 0.2154 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 60.0 | 720 | 0.2187 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 61.0 | 732 | 0.2193 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 62.0 | 744 | 0.2200 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 63.0 | 756 | 0.2203 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 64.0 | 768 | 0.2192 | 0.9718 | 0.8861 | 0.9326 | 0.9296 | 5.0625 |
| 0.4765 | 65.0 | 780 | 0.2185 | 0.9718 | 0.8708 | 0.9204 | 0.9193 | 5.0625 |
| 0.4765 | 66.0 | 792 | 0.2189 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 67.0 | 804 | 0.2186 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 68.0 | 816 | 0.2181 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 69.0 | 828 | 0.2176 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 70.0 | 840 | 0.2193 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 71.0 | 852 | 0.2198 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 72.0 | 864 | 0.2202 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 73.0 | 876 | 0.2193 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 74.0 | 888 | 0.2191 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 75.0 | 900 | 0.2208 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 76.0 | 912 | 0.2206 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 77.0 | 924 | 0.2193 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 78.0 | 936 | 0.2183 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 79.0 | 948 | 0.2185 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 80.0 | 960 | 0.2176 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 81.0 | 972 | 0.2175 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 82.0 | 984 | 0.2181 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.4765 | 83.0 | 996 | 0.2184 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 84.0 | 1008 | 0.2172 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 85.0 | 1020 | 0.2177 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 86.0 | 1032 | 0.2175 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 87.0 | 1044 | 0.2180 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 88.0 | 1056 | 0.2180 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 89.0 | 1068 | 0.2181 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 90.0 | 1080 | 0.2179 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 91.0 | 1092 | 0.2178 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 92.0 | 1104 | 0.2179 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 93.0 | 1116 | 0.2175 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 94.0 | 1128 | 0.2179 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 95.0 | 1140 | 0.2181 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 96.0 | 1152 | 0.2182 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 97.0 | 1164 | 0.2184 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 98.0 | 1176 | 0.2186 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 99.0 | 1188 | 0.2186 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
| 0.1106 | 100.0 | 1200 | 0.2186 | 0.9718 | 0.8861 | 0.9312 | 0.9298 | 5.0625 |
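
The Rouge1/Rouge2/Rougel/Rougelsum columns look like the 0-1 F-measures returned by the Hugging Face evaluate library's ROUGE metric; that this implementation was used is an assumption, since the card does not name it. A minimal sketch of computing such scores from decoded predictions and references:

```python
import evaluate
import numpy as np

rouge = evaluate.load("rouge")

# Hypothetical decoded model outputs and reference summaries, just to show the call.
predictions = ["the bill amends the internal revenue code"]
references = ["this bill amends the internal revenue code of 1986"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}

# "Gen Len" is usually the mean number of generated tokens; approximated here
# with whitespace tokens purely for illustration.
print(np.mean([len(p.split()) for p in predictions]))
```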

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
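
As a small reproducibility sketch, the versions listed above can be verified after installation; the import names are assumed to match the packages in the list.

```python
import transformers, torch, datasets, tokenizers

# Versions reported in this card; adjust if you intentionally use newer releases.
expected = {
    transformers: "4.41.2",
    torch: "2.3.0",        # the card reports 2.3.0+cu121
    datasets: "2.20.0",
    tokenizers: "0.19.1",
}
for module, version in expected.items():
    assert module.__version__.startswith(version), (module.__name__, module.__version__)
```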