arxiv-summarization-fb-bart-base-2022-09-21

This model is a fine-tuned version of facebook/bart-base on the ccdv/arxiv-summarization dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1597
  • Rouge1: 42.9082
  • Rouge2: 15.7763
  • Rougel: 25.9239
  • Rougelsum: 37.7957
  • Gen Len: 110.5816
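
For readers unfamiliar with the ROUGE figures above, here is a minimal sketch of ROUGE-1 as a unigram-overlap F1 score, scaled to 0-100 as in the list. It is an illustration only: the whitespace tokenization and the helper name `rouge1_f1` are assumptions. The reported scores were presumably produced by a standard ROUGE implementation with its own tokenization and stemming, so this sketch will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two texts, on a 0-100 scale (simplified)."""
    # Assumption: lowercase + whitespace split stands in for a real ROUGE tokenizer.
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the dataset.
    print(rouge1_f1("the model summarizes papers",
                    "the model summarizes arxiv papers"))
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence rather than n-gram counts.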

Model description

More information needed

Intended uses & limitations

More information needed
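
Until intended uses are documented, the following is a hedged starting point rather than the author's recommended setup. A summarization checkpoint fine-tuned from facebook/bart-base can usually be loaded through the `summarization` pipeline in transformers. The repository id is the one shown on this page; the helper name `summarize` and the generation lengths are illustrative assumptions.

```python
# Repository id as it appears on this model page.
MODEL_ID = "farleyknight/arxiv-summarization-fb-bart-base-2022-09-21"


def summarize(text: str, max_length: int = 128, min_length: int = 32) -> str:
    """Return one generated summary for `text` (sketch, untuned settings)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the encoder limit
    # (1024 tokens for bart-base); full arXiv articles are usually longer.
    result = summarizer(
        text,
        max_length=max_length,   # assumption: not the author's setting
        min_length=min_length,   # assumption: not the author's setting
        truncation=True,
    )
    return result[0]["summary_text"]
```

Because the input is truncated, anything past the first 1024 tokens of an article does not influence the summary, which is one practical limitation to keep in mind.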

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
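
As a rough sketch, the values above can be mapped onto the field names of `Seq2SeqTrainingArguments` in transformers. The mapping assumes a single device (so `train_batch_size` equals `per_device_train_batch_size`) and no gradient accumulation; `predict_with_generate=True` and the helper name `build_training_args` are assumptions added for illustration, not settings confirmed by this card.

```python
# Keyword arguments named after Seq2SeqTrainingArguments fields.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 1,  # assumes one device
    "per_device_eval_batch_size": 1,   # assumes one device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}


def build_training_args(output_dir: str):
    """Build training arguments from the values listed in this card (sketch)."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed to score ROUGE at eval time
        **training_kwargs,
    )
```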

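Training results

Gen Len stays at about 20 tokens in every row below, compared with 110.5816 in the final evaluation, so these intermediate ROUGE scores were evidently computed on much shorter generated summaries and are not directly comparable to the results reported at the top of this card.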

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.9142 | 0.05 | 10000 | 2.7522 | 17.073 | 6.7502 | 13.6779 | 15.6668 | 20.0 |
| 2.7876 | 0.1 | 20000 | 2.6888 | 16.7954 | 6.7038 | 13.4939 | 15.3426 | 19.9992 |
| 2.715 | 0.15 | 30000 | 2.6308 | 17.3324 | 6.8771 | 13.7918 | 15.7839 | 20.0 |
| 2.6431 | 0.2 | 40000 | 2.5858 | 16.7055 | 6.8108 | 13.4796 | 15.2959 | 20.0 |
| 2.6381 | 0.25 | 50000 | 2.5393 | 17.4643 | 7.0687 | 13.9507 | 16.012 | 20.0 |
| 2.6269 | 0.3 | 60000 | 2.5159 | 17.5934 | 7.0022 | 13.9394 | 16.0203 | 20.0 |
| 2.5482 | 0.34 | 70000 | 2.4894 | 17.5428 | 7.1822 | 13.9788 | 16.0355 | 20.0 |
| 2.4962 | 0.39 | 80000 | 2.4476 | 17.3587 | 7.1501 | 13.9215 | 15.8637 | 20.0 |
| 2.513 | 0.44 | 90000 | 2.4309 | 18.0806 | 7.5429 | 14.4201 | 16.561 | 20.0 |
| 2.4464 | 0.49 | 100000 | 2.4128 | 17.9813 | 7.5454 | 14.3403 | 16.52 | 19.9989 |
| 2.4969 | 0.54 | 110000 | 2.4114 | 17.3353 | 7.1382 | 13.9109 | 15.873 | 20.0 |
| 2.4417 | 0.59 | 120000 | 2.3866 | 18.0241 | 7.553 | 14.3892 | 16.5077 | 19.9980 |
| 2.4333 | 0.64 | 130000 | 2.3903 | 18.0578 | 7.4999 | 14.3901 | 16.5134 | 20.0 |
| 2.4296 | 0.69 | 140000 | 2.3793 | 17.7742 | 7.5182 | 14.2794 | 16.2879 | 20.0 |
| 2.4277 | 0.74 | 150000 | 2.3571 | 17.8015 | 7.4677 | 14.226 | 16.3288 | 20.0 |
| 2.4258 | 0.79 | 160000 | 2.3539 | 17.5335 | 7.399 | 14.09 | 16.0936 | 20.0 |
| 2.4006 | 0.84 | 170000 | 2.3469 | 17.5983 | 7.4285 | 14.1315 | 16.1385 | 20.0 |
| 2.367 | 0.89 | 180000 | 2.3344 | 17.297 | 7.2361 | 13.9286 | 15.8352 | 20.0 |
| 2.373 | 0.94 | 190000 | 2.3377 | 17.7189 | 7.4993 | 14.2603 | 16.2546 | 19.9980 |
| 2.3762 | 0.99 | 200000 | 2.3106 | 17.7883 | 7.4766 | 14.2675 | 16.3115 | 20.0 |
| 2.2538 | 1.03 | 210000 | 2.3197 | 17.4487 | 7.4171 | 14.0473 | 15.9771 | 20.0 |
| 2.268 | 1.08 | 220000 | 2.3044 | 17.9603 | 7.5806 | 14.3755 | 16.4328 | 20.0 |
| 2.2986 | 1.13 | 230000 | 2.3002 | 17.9268 | 7.5321 | 14.3503 | 16.4191 | 20.0 |
| 2.241 | 1.18 | 240000 | 2.3059 | 17.4542 | 7.3224 | 14.0578 | 16.0157 | 20.0 |
| 2.2534 | 1.23 | 250000 | 2.2927 | 17.8039 | 7.6232 | 14.2916 | 16.3442 | 20.0 |
| 2.26 | 1.28 | 260000 | 2.2910 | 17.8607 | 7.5645 | 14.318 | 16.3336 | 19.9983 |
| 2.3 | 1.33 | 270000 | 2.2818 | 17.8203 | 7.4815 | 14.3171 | 16.3309 | 20.0 |
| 2.2964 | 1.38 | 280000 | 2.2721 | 17.983 | 7.6867 | 14.3971 | 16.493 | 20.0 |
| 2.2564 | 1.43 | 290000 | 2.2701 | 18.059 | 7.7273 | 14.4806 | 16.5792 | 19.9988 |
| 2.2576 | 1.48 | 300000 | 2.2663 | 17.5706 | 7.4424 | 14.1424 | 16.1297 | 20.0 |
| 2.2605 | 1.53 | 310000 | 2.2607 | 17.8057 | 7.5219 | 14.3226 | 16.3355 | 19.9988 |
| 2.2587 | 1.58 | 320000 | 2.2552 | 18.0396 | 7.7064 | 14.5005 | 16.5823 | 20.0 |
| 2.2423 | 1.63 | 330000 | 2.2523 | 18.2229 | 7.8398 | 14.5868 | 16.7408 | 20.0 |
| 2.2793 | 1.68 | 340000 | 2.2431 | 17.6785 | 7.5437 | 14.1971 | 16.1724 | 19.9988 |
| 2.2005 | 1.72 | 350000 | 2.2343 | 17.7552 | 7.6026 | 14.2152 | 16.2797 | 19.9988 |
| 2.2454 | 1.77 | 360000 | 2.2339 | 17.9292 | 7.699 | 14.4099 | 16.4682 | 20.0 |
| 2.2175 | 1.82 | 370000 | 2.2345 | 17.7413 | 7.4892 | 14.2223 | 16.2442 | 20.0 |
| 2.238 | 1.87 | 380000 | 2.2259 | 17.6679 | 7.4976 | 14.24 | 16.243 | 19.9988 |
| 2.2108 | 1.92 | 390000 | 2.2210 | 17.8474 | 7.6054 | 14.3494 | 16.3635 | 19.9988 |
| 2.2124 | 1.97 | 400000 | 2.2170 | 17.8019 | 7.5182 | 14.264 | 16.3003 | 20.0 |
| 2.0976 | 2.02 | 410000 | 2.2248 | 17.8063 | 7.5383 | 14.2782 | 16.275 | 20.0 |
| 2.0932 | 2.07 | 420000 | 2.2196 | 17.9171 | 7.6187 | 14.3508 | 16.4333 | 20.0 |
| 2.0956 | 2.12 | 430000 | 2.2135 | 18.0616 | 7.7655 | 14.4837 | 16.5627 | 19.9988 |
| 2.0515 | 2.17 | 440000 | 2.2091 | 18.0281 | 7.7301 | 14.4696 | 16.5196 | 19.9981 |
| 2.1216 | 2.22 | 450000 | 2.2015 | 18.0609 | 7.7541 | 14.4633 | 16.5705 | 19.9988 |
| 2.1222 | 2.27 | 460000 | 2.1983 | 18.0717 | 7.7473 | 14.4725 | 16.5399 | 19.9988 |
| 2.0903 | 2.32 | 470000 | 2.2007 | 18.0751 | 7.7486 | 14.4583 | 16.546 | 20.0 |
| 2.1124 | 2.37 | 480000 | 2.1934 | 17.888 | 7.7124 | 14.3899 | 16.3901 | 20.0 |
| 2.1094 | 2.41 | 490000 | 2.1901 | 18.0254 | 7.7682 | 14.4427 | 16.5181 | 20.0 |
| 2.1085 | 2.46 | 500000 | 2.1924 | 17.9077 | 7.7004 | 14.3843 | 16.4221 | 19.9988 |
| 2.0781 | 2.51 | 510000 | 2.1781 | 18.1591 | 7.8456 | 14.565 | 16.6435 | 19.9988 |
| 2.0875 | 2.56 | 520000 | 2.1801 | 18.0389 | 7.7342 | 14.4259 | 16.5378 | 20.0 |
| 2.0945 | 2.61 | 530000 | 2.1758 | 18.0999 | 7.8217 | 14.5163 | 16.5784 | 19.9988 |
| 2.0723 | 2.66 | 540000 | 2.1756 | 17.9684 | 7.7369 | 14.4279 | 16.4815 | 19.9988 |
| 2.0918 | 2.71 | 550000 | 2.1738 | 18.1183 | 7.8414 | 14.5298 | 16.6119 | 19.9988 |
| 2.0835 | 2.76 | 560000 | 2.1671 | 17.8837 | 7.7379 | 14.3727 | 16.4068 | 19.9988 |
| 2.0936 | 2.81 | 570000 | 2.1670 | 17.9631 | 7.7708 | 14.4566 | 16.4823 | 19.9988 |
| 2.0518 | 2.86 | 580000 | 2.1631 | 18.0601 | 7.8112 | 14.5158 | 16.5816 | 19.9988 |
| 2.065 | 2.91 | 590000 | 2.1611 | 18.0548 | 7.8147 | 14.5271 | 16.5606 | 19.9988 |
| 2.0427 | 2.96 | 600000 | 2.1611 | 18.0642 | 7.8284 | 14.5293 | 16.5736 | 19.9988 |
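
The Epoch and Step columns, together with train_batch_size 1, allow a back-of-the-envelope estimate of the number of optimizer steps per epoch and of the learning rate at any logged step. The sketch below assumes a single device, no gradient accumulation, and no warmup (none of which this card confirms); `steps_per_epoch` and `linear_lr` are hypothetical helper names. Since Epoch is rounded to two decimals, the estimates are approximate.

```python
def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05) -> float:
    """Learning rate under linear decay from base_lr to 0, assuming no warmup."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


# Last logged row of the table: step 600000 at epoch 2.96.
per_epoch = steps_per_epoch(600000, 2.96)

# num_epochs is 3.0, so the estimated length of the full run is:
total_steps = round(per_epoch * 3.0)

if __name__ == "__main__":
    print(f"estimated steps per epoch: {per_epoch:.0f}")
    print(f"estimated total steps:     {total_steps}")
    print(f"estimated lr at step 600000: {linear_lr(600000, total_steps):.2e}")
```

With a batch size of 1 and the assumptions above, the steps-per-epoch estimate is also an estimate of the number of training examples seen per epoch.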

Framework versions

  • Transformers 4.23.0.dev0
  • Pytorch 1.12.0
  • Datasets 2.5.1
  • Tokenizers 0.13.0

Dataset used to train farleyknight/arxiv-summarization-fb-bart-base-2022-09-21