arxiv-summarization-fb-bart-base-2022-09-21
This model is a fine-tuned version of facebook/bart-base on the ccdv/arxiv-summarization dataset. It achieves the following results on the evaluation set:
- Loss: 2.1597
- Rouge1: 42.9082
- Rouge2: 15.7763
- Rougel: 25.9239
- Rougelsum: 37.7957
- Gen Len: 110.5816
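For readers unfamiliar with the metrics listed above, the following is a minimal sketch of how a ROUGE-N F-measure can be computed, scaled to 0-100 as in this card. It assumes plain lowercase whitespace tokenization, and `rouge_n` and `ngrams` are hypothetical helper names. The evaluation behind the reported numbers most likely used a dedicated ROUGE package with its own tokenization, so this sketch is not expected to reproduce those figures exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1 score on a 0-100 scale.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy sentences, not taken from the dataset.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n(reference, candidate, n=1), 2))
print(round(rouge_n(reference, candidate, n=2), 2))
```

Rouge1 and Rouge2 in the list above correspond to n=1 and n=2; RougeL and RougeLsum are based on the longest common subsequence instead and are not covered by this sketch.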
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
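The model was fine-tuned and evaluated on the ccdv/arxiv-summarization dataset. More information on the splits and preprocessing is needed.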
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
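As a rough illustration of the settings above, the sketch below collects them under the keyword names used by `transformers.Seq2SeqTrainingArguments` and shows what a linear schedule does to the learning rate. Several things here are assumptions rather than facts from this card: that training ran on a single device (so the batch sizes map to per-device values), that no warmup was used (none is listed), and the total step count used in the usage lines. `TRAINING_KWARGS` and `linear_lr` are hypothetical names.

```python
# Hyperparameters copied from the list above, keyed by the argument names
# used by transformers.Seq2SeqTrainingArguments. Mapping the card's
# train_batch_size/eval_batch_size to per-device values assumes one device.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}


def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative total step count (an assumption, not a value from the card).
TOTAL_STEPS = 600_000
for step in (0, 300_000, 600_000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 5e-05 and shrinks in proportion to the remaining steps, reaching zero at the end of training.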
Training results
The ROUGE scores in this table were computed on generated summaries of at most 20 tokens (see the Gen Len column), so they are much lower than, and not directly comparable to, the final evaluation results reported at the top of this card, where the average generation length is about 110.6 tokens.
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
2.9142 | 0.05 | 10000 | 2.7522 | 17.073 | 6.7502 | 13.6779 | 15.6668 | 20.0 |
2.7876 | 0.1 | 20000 | 2.6888 | 16.7954 | 6.7038 | 13.4939 | 15.3426 | 19.9992 |
2.715 | 0.15 | 30000 | 2.6308 | 17.3324 | 6.8771 | 13.7918 | 15.7839 | 20.0 |
2.6431 | 0.2 | 40000 | 2.5858 | 16.7055 | 6.8108 | 13.4796 | 15.2959 | 20.0 |
2.6381 | 0.25 | 50000 | 2.5393 | 17.4643 | 7.0687 | 13.9507 | 16.012 | 20.0 |
2.6269 | 0.3 | 60000 | 2.5159 | 17.5934 | 7.0022 | 13.9394 | 16.0203 | 20.0 |
2.5482 | 0.34 | 70000 | 2.4894 | 17.5428 | 7.1822 | 13.9788 | 16.0355 | 20.0 |
2.4962 | 0.39 | 80000 | 2.4476 | 17.3587 | 7.1501 | 13.9215 | 15.8637 | 20.0 |
2.513 | 0.44 | 90000 | 2.4309 | 18.0806 | 7.5429 | 14.4201 | 16.561 | 20.0 |
2.4464 | 0.49 | 100000 | 2.4128 | 17.9813 | 7.5454 | 14.3403 | 16.52 | 19.9989 |
2.4969 | 0.54 | 110000 | 2.4114 | 17.3353 | 7.1382 | 13.9109 | 15.873 | 20.0 |
2.4417 | 0.59 | 120000 | 2.3866 | 18.0241 | 7.553 | 14.3892 | 16.5077 | 19.9980 |
2.4333 | 0.64 | 130000 | 2.3903 | 18.0578 | 7.4999 | 14.3901 | 16.5134 | 20.0 |
2.4296 | 0.69 | 140000 | 2.3793 | 17.7742 | 7.5182 | 14.2794 | 16.2879 | 20.0 |
2.4277 | 0.74 | 150000 | 2.3571 | 17.8015 | 7.4677 | 14.226 | 16.3288 | 20.0 |
2.4258 | 0.79 | 160000 | 2.3539 | 17.5335 | 7.399 | 14.09 | 16.0936 | 20.0 |
2.4006 | 0.84 | 170000 | 2.3469 | 17.5983 | 7.4285 | 14.1315 | 16.1385 | 20.0 |
2.367 | 0.89 | 180000 | 2.3344 | 17.297 | 7.2361 | 13.9286 | 15.8352 | 20.0 |
2.373 | 0.94 | 190000 | 2.3377 | 17.7189 | 7.4993 | 14.2603 | 16.2546 | 19.9980 |
2.3762 | 0.99 | 200000 | 2.3106 | 17.7883 | 7.4766 | 14.2675 | 16.3115 | 20.0 |
2.2538 | 1.03 | 210000 | 2.3197 | 17.4487 | 7.4171 | 14.0473 | 15.9771 | 20.0 |
2.268 | 1.08 | 220000 | 2.3044 | 17.9603 | 7.5806 | 14.3755 | 16.4328 | 20.0 |
2.2986 | 1.13 | 230000 | 2.3002 | 17.9268 | 7.5321 | 14.3503 | 16.4191 | 20.0 |
2.241 | 1.18 | 240000 | 2.3059 | 17.4542 | 7.3224 | 14.0578 | 16.0157 | 20.0 |
2.2534 | 1.23 | 250000 | 2.2927 | 17.8039 | 7.6232 | 14.2916 | 16.3442 | 20.0 |
2.26 | 1.28 | 260000 | 2.2910 | 17.8607 | 7.5645 | 14.318 | 16.3336 | 19.9983 |
2.3 | 1.33 | 270000 | 2.2818 | 17.8203 | 7.4815 | 14.3171 | 16.3309 | 20.0 |
2.2964 | 1.38 | 280000 | 2.2721 | 17.983 | 7.6867 | 14.3971 | 16.493 | 20.0 |
2.2564 | 1.43 | 290000 | 2.2701 | 18.059 | 7.7273 | 14.4806 | 16.5792 | 19.9988 |
2.2576 | 1.48 | 300000 | 2.2663 | 17.5706 | 7.4424 | 14.1424 | 16.1297 | 20.0 |
2.2605 | 1.53 | 310000 | 2.2607 | 17.8057 | 7.5219 | 14.3226 | 16.3355 | 19.9988 |
2.2587 | 1.58 | 320000 | 2.2552 | 18.0396 | 7.7064 | 14.5005 | 16.5823 | 20.0 |
2.2423 | 1.63 | 330000 | 2.2523 | 18.2229 | 7.8398 | 14.5868 | 16.7408 | 20.0 |
2.2793 | 1.68 | 340000 | 2.2431 | 17.6785 | 7.5437 | 14.1971 | 16.1724 | 19.9988 |
2.2005 | 1.72 | 350000 | 2.2343 | 17.7552 | 7.6026 | 14.2152 | 16.2797 | 19.9988 |
2.2454 | 1.77 | 360000 | 2.2339 | 17.9292 | 7.699 | 14.4099 | 16.4682 | 20.0 |
2.2175 | 1.82 | 370000 | 2.2345 | 17.7413 | 7.4892 | 14.2223 | 16.2442 | 20.0 |
2.238 | 1.87 | 380000 | 2.2259 | 17.6679 | 7.4976 | 14.24 | 16.243 | 19.9988 |
2.2108 | 1.92 | 390000 | 2.2210 | 17.8474 | 7.6054 | 14.3494 | 16.3635 | 19.9988 |
2.2124 | 1.97 | 400000 | 2.2170 | 17.8019 | 7.5182 | 14.264 | 16.3003 | 20.0 |
2.0976 | 2.02 | 410000 | 2.2248 | 17.8063 | 7.5383 | 14.2782 | 16.275 | 20.0 |
2.0932 | 2.07 | 420000 | 2.2196 | 17.9171 | 7.6187 | 14.3508 | 16.4333 | 20.0 |
2.0956 | 2.12 | 430000 | 2.2135 | 18.0616 | 7.7655 | 14.4837 | 16.5627 | 19.9988 |
2.0515 | 2.17 | 440000 | 2.2091 | 18.0281 | 7.7301 | 14.4696 | 16.5196 | 19.9981 |
2.1216 | 2.22 | 450000 | 2.2015 | 18.0609 | 7.7541 | 14.4633 | 16.5705 | 19.9988 |
2.1222 | 2.27 | 460000 | 2.1983 | 18.0717 | 7.7473 | 14.4725 | 16.5399 | 19.9988 |
2.0903 | 2.32 | 470000 | 2.2007 | 18.0751 | 7.7486 | 14.4583 | 16.546 | 20.0 |
2.1124 | 2.37 | 480000 | 2.1934 | 17.888 | 7.7124 | 14.3899 | 16.3901 | 20.0 |
2.1094 | 2.41 | 490000 | 2.1901 | 18.0254 | 7.7682 | 14.4427 | 16.5181 | 20.0 |
2.1085 | 2.46 | 500000 | 2.1924 | 17.9077 | 7.7004 | 14.3843 | 16.4221 | 19.9988 |
2.0781 | 2.51 | 510000 | 2.1781 | 18.1591 | 7.8456 | 14.565 | 16.6435 | 19.9988 |
2.0875 | 2.56 | 520000 | 2.1801 | 18.0389 | 7.7342 | 14.4259 | 16.5378 | 20.0 |
2.0945 | 2.61 | 530000 | 2.1758 | 18.0999 | 7.8217 | 14.5163 | 16.5784 | 19.9988 |
2.0723 | 2.66 | 540000 | 2.1756 | 17.9684 | 7.7369 | 14.4279 | 16.4815 | 19.9988 |
2.0918 | 2.71 | 550000 | 2.1738 | 18.1183 | 7.8414 | 14.5298 | 16.6119 | 19.9988 |
2.0835 | 2.76 | 560000 | 2.1671 | 17.8837 | 7.7379 | 14.3727 | 16.4068 | 19.9988 |
2.0936 | 2.81 | 570000 | 2.1670 | 17.9631 | 7.7708 | 14.4566 | 16.4823 | 19.9988 |
2.0518 | 2.86 | 580000 | 2.1631 | 18.0601 | 7.8112 | 14.5158 | 16.5816 | 19.9988 |
2.065 | 2.91 | 590000 | 2.1611 | 18.0548 | 7.8147 | 14.5271 | 16.5606 | 19.9988 |
2.0427 | 2.96 | 600000 | 2.1611 | 18.0642 | 7.8284 | 14.5293 | 16.5736 | 19.9988 |
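The Step and Epoch columns above can be used to estimate how many optimizer steps make up one epoch. The sketch below does this for a few rows, assuming the Epoch column is rounded to two decimals. With a train batch size of 1, and assuming a single device and no gradient accumulation (neither is stated in this card), the result also approximates the number of training examples. `steps_per_epoch_bounds` is a hypothetical helper name.

```python
# (step, epoch) pairs copied from a few rows of the table above.
ROWS = [(10000, 0.05), (200000, 0.99), (400000, 1.97), (600000, 2.96)]

# Assumption: the Epoch column is rounded to two decimals, so the true
# value lies within +/- 0.005 of what is printed.
HALF_STEP = 0.005


def steps_per_epoch_bounds(rows, half_step=HALF_STEP):
    """Return (lower, upper) bounds on steps per epoch consistent with all rows."""
    # Each row alone gives an interval; the true value must lie in all of
    # them, so take the largest lower bound and the smallest upper bound.
    lower = max(step / (epoch + half_step) for step, epoch in rows)
    upper = min(step / (epoch - half_step) for step, epoch in rows)
    return lower, upper


lower, upper = steps_per_epoch_bounds(ROWS)
print(f"steps per epoch between {lower:.0f} and {upper:.0f}")
```

Later rows constrain the estimate more tightly than early ones, because the same rounding error in the Epoch column is spread over more steps.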
Framework versions
- Transformers 4.23.0.dev0
- Pytorch 1.12.0
- Datasets 2.5.1
- Tokenizers 0.13.0