farleyknight committed on
Commit 638b325
1 Parent(s): d4f9824

update model card README.md

Files changed (1)
  1. README.md +137 -0
README.md ADDED
@@ -0,0 +1,137 @@
+ ---
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ datasets:
+ - arxiv-summarization
+ metrics:
+ - rouge
+ model-index:
+ - name: arxiv-summarization-fb-bart-base-2022-09-21
+   results:
+   - task:
+       name: Sequence-to-sequence Language Modeling
+       type: text2text-generation
+     dataset:
+       name: arxiv-summarization
+       type: arxiv-summarization
+       config: section
+       split: train
+       args: section
+     metrics:
+     - name: Rouge1
+       type: rouge
+       value: 18.0642
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # arxiv-summarization-fb-bart-base-2022-09-21
+
+ This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the arxiv-summarization dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.1611
+ - Rouge1: 18.0642
+ - Rouge2: 7.8284
+ - Rougel: 14.5293
+ - Rougelsum: 16.5736
+ - Gen Len: 19.9988
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3.0
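
For anyone trying to reproduce the run, the hyperparameters listed above can be written as keyword arguments. The sketch below is only an illustration: the original training script is not part of this commit, the dictionary name `training_kwargs` is hypothetical, and the mapping assumes a single device so that `train_batch_size` corresponds to `per_device_train_batch_size` in `transformers.Seq2SeqTrainingArguments`.

```python
# Hypothetical mapping of the card's hyperparameters onto keyword names
# accepted by transformers.Seq2SeqTrainingArguments. This is an assumption:
# the actual training script is not shown in this commit.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 1,  # card: train_batch_size (single device assumed)
    "per_device_eval_batch_size": 1,   # card: eval_batch_size (single device assumed)
    "seed": 42,
    "adam_beta1": 0.9,                 # card: Adam betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}

# With transformers installed, the dict could be passed along these lines
# ("out" is a placeholder output directory):
#   from transformers import Seq2SeqTrainingArguments
#   args = Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

Keeping the values in a plain dictionary makes it easy to compare them against the list in the card before constructing any trainer objects.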
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
+ |:-------------:|:-----:|:------:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
+ | 2.9142 | 0.05 | 10000 | 2.7522 | 17.073 | 6.7502 | 13.6779 | 15.6668 | 20.0 |
+ | 2.7876 | 0.1 | 20000 | 2.6888 | 16.7954 | 6.7038 | 13.4939 | 15.3426 | 19.9992 |
+ | 2.715 | 0.15 | 30000 | 2.6308 | 17.3324 | 6.8771 | 13.7918 | 15.7839 | 20.0 |
+ | 2.6431 | 0.2 | 40000 | 2.5858 | 16.7055 | 6.8108 | 13.4796 | 15.2959 | 20.0 |
+ | 2.6381 | 0.25 | 50000 | 2.5393 | 17.4643 | 7.0687 | 13.9507 | 16.012 | 20.0 |
+ | 2.6269 | 0.3 | 60000 | 2.5159 | 17.5934 | 7.0022 | 13.9394 | 16.0203 | 20.0 |
+ | 2.5482 | 0.34 | 70000 | 2.4894 | 17.5428 | 7.1822 | 13.9788 | 16.0355 | 20.0 |
+ | 2.4962 | 0.39 | 80000 | 2.4476 | 17.3587 | 7.1501 | 13.9215 | 15.8637 | 20.0 |
+ | 2.513 | 0.44 | 90000 | 2.4309 | 18.0806 | 7.5429 | 14.4201 | 16.561 | 20.0 |
+ | 2.4464 | 0.49 | 100000 | 2.4128 | 17.9813 | 7.5454 | 14.3403 | 16.52 | 19.9989 |
+ | 2.4969 | 0.54 | 110000 | 2.4114 | 17.3353 | 7.1382 | 13.9109 | 15.873 | 20.0 |
+ | 2.4417 | 0.59 | 120000 | 2.3866 | 18.0241 | 7.553 | 14.3892 | 16.5077 | 19.9980 |
+ | 2.4333 | 0.64 | 130000 | 2.3903 | 18.0578 | 7.4999 | 14.3901 | 16.5134 | 20.0 |
+ | 2.4296 | 0.69 | 140000 | 2.3793 | 17.7742 | 7.5182 | 14.2794 | 16.2879 | 20.0 |
+ | 2.4277 | 0.74 | 150000 | 2.3571 | 17.8015 | 7.4677 | 14.226 | 16.3288 | 20.0 |
+ | 2.4258 | 0.79 | 160000 | 2.3539 | 17.5335 | 7.399 | 14.09 | 16.0936 | 20.0 |
+ | 2.4006 | 0.84 | 170000 | 2.3469 | 17.5983 | 7.4285 | 14.1315 | 16.1385 | 20.0 |
+ | 2.367 | 0.89 | 180000 | 2.3344 | 17.297 | 7.2361 | 13.9286 | 15.8352 | 20.0 |
+ | 2.373 | 0.94 | 190000 | 2.3377 | 17.7189 | 7.4993 | 14.2603 | 16.2546 | 19.9980 |
+ | 2.3762 | 0.99 | 200000 | 2.3106 | 17.7883 | 7.4766 | 14.2675 | 16.3115 | 20.0 |
+ | 2.2538 | 1.03 | 210000 | 2.3197 | 17.4487 | 7.4171 | 14.0473 | 15.9771 | 20.0 |
+ | 2.268 | 1.08 | 220000 | 2.3044 | 17.9603 | 7.5806 | 14.3755 | 16.4328 | 20.0 |
+ | 2.2986 | 1.13 | 230000 | 2.3002 | 17.9268 | 7.5321 | 14.3503 | 16.4191 | 20.0 |
+ | 2.241 | 1.18 | 240000 | 2.3059 | 17.4542 | 7.3224 | 14.0578 | 16.0157 | 20.0 |
+ | 2.2534 | 1.23 | 250000 | 2.2927 | 17.8039 | 7.6232 | 14.2916 | 16.3442 | 20.0 |
+ | 2.26 | 1.28 | 260000 | 2.2910 | 17.8607 | 7.5645 | 14.318 | 16.3336 | 19.9983 |
+ | 2.3 | 1.33 | 270000 | 2.2818 | 17.8203 | 7.4815 | 14.3171 | 16.3309 | 20.0 |
+ | 2.2964 | 1.38 | 280000 | 2.2721 | 17.983 | 7.6867 | 14.3971 | 16.493 | 20.0 |
+ | 2.2564 | 1.43 | 290000 | 2.2701 | 18.059 | 7.7273 | 14.4806 | 16.5792 | 19.9988 |
+ | 2.2576 | 1.48 | 300000 | 2.2663 | 17.5706 | 7.4424 | 14.1424 | 16.1297 | 20.0 |
+ | 2.2605 | 1.53 | 310000 | 2.2607 | 17.8057 | 7.5219 | 14.3226 | 16.3355 | 19.9988 |
+ | 2.2587 | 1.58 | 320000 | 2.2552 | 18.0396 | 7.7064 | 14.5005 | 16.5823 | 20.0 |
+ | 2.2423 | 1.63 | 330000 | 2.2523 | 18.2229 | 7.8398 | 14.5868 | 16.7408 | 20.0 |
+ | 2.2793 | 1.68 | 340000 | 2.2431 | 17.6785 | 7.5437 | 14.1971 | 16.1724 | 19.9988 |
+ | 2.2005 | 1.72 | 350000 | 2.2343 | 17.7552 | 7.6026 | 14.2152 | 16.2797 | 19.9988 |
+ | 2.2454 | 1.77 | 360000 | 2.2339 | 17.9292 | 7.699 | 14.4099 | 16.4682 | 20.0 |
+ | 2.2175 | 1.82 | 370000 | 2.2345 | 17.7413 | 7.4892 | 14.2223 | 16.2442 | 20.0 |
+ | 2.238 | 1.87 | 380000 | 2.2259 | 17.6679 | 7.4976 | 14.24 | 16.243 | 19.9988 |
+ | 2.2108 | 1.92 | 390000 | 2.2210 | 17.8474 | 7.6054 | 14.3494 | 16.3635 | 19.9988 |
+ | 2.2124 | 1.97 | 400000 | 2.2170 | 17.8019 | 7.5182 | 14.264 | 16.3003 | 20.0 |
+ | 2.0976 | 2.02 | 410000 | 2.2248 | 17.8063 | 7.5383 | 14.2782 | 16.275 | 20.0 |
+ | 2.0932 | 2.07 | 420000 | 2.2196 | 17.9171 | 7.6187 | 14.3508 | 16.4333 | 20.0 |
+ | 2.0956 | 2.12 | 430000 | 2.2135 | 18.0616 | 7.7655 | 14.4837 | 16.5627 | 19.9988 |
+ | 2.0515 | 2.17 | 440000 | 2.2091 | 18.0281 | 7.7301 | 14.4696 | 16.5196 | 19.9981 |
+ | 2.1216 | 2.22 | 450000 | 2.2015 | 18.0609 | 7.7541 | 14.4633 | 16.5705 | 19.9988 |
+ | 2.1222 | 2.27 | 460000 | 2.1983 | 18.0717 | 7.7473 | 14.4725 | 16.5399 | 19.9988 |
+ | 2.0903 | 2.32 | 470000 | 2.2007 | 18.0751 | 7.7486 | 14.4583 | 16.546 | 20.0 |
+ | 2.1124 | 2.37 | 480000 | 2.1934 | 17.888 | 7.7124 | 14.3899 | 16.3901 | 20.0 |
+ | 2.1094 | 2.41 | 490000 | 2.1901 | 18.0254 | 7.7682 | 14.4427 | 16.5181 | 20.0 |
+ | 2.1085 | 2.46 | 500000 | 2.1924 | 17.9077 | 7.7004 | 14.3843 | 16.4221 | 19.9988 |
+ | 2.0781 | 2.51 | 510000 | 2.1781 | 18.1591 | 7.8456 | 14.565 | 16.6435 | 19.9988 |
+ | 2.0875 | 2.56 | 520000 | 2.1801 | 18.0389 | 7.7342 | 14.4259 | 16.5378 | 20.0 |
+ | 2.0945 | 2.61 | 530000 | 2.1758 | 18.0999 | 7.8217 | 14.5163 | 16.5784 | 19.9988 |
+ | 2.0723 | 2.66 | 540000 | 2.1756 | 17.9684 | 7.7369 | 14.4279 | 16.4815 | 19.9988 |
+ | 2.0918 | 2.71 | 550000 | 2.1738 | 18.1183 | 7.8414 | 14.5298 | 16.6119 | 19.9988 |
+ | 2.0835 | 2.76 | 560000 | 2.1671 | 17.8837 | 7.7379 | 14.3727 | 16.4068 | 19.9988 |
+ | 2.0936 | 2.81 | 570000 | 2.1670 | 17.9631 | 7.7708 | 14.4566 | 16.4823 | 19.9988 |
+ | 2.0518 | 2.86 | 580000 | 2.1631 | 18.0601 | 7.8112 | 14.5158 | 16.5816 | 19.9988 |
+ | 2.065 | 2.91 | 590000 | 2.1611 | 18.0548 | 7.8147 | 14.5271 | 16.5606 | 19.9988 |
+ | 2.0427 | 2.96 | 600000 | 2.1611 | 18.0642 | 7.8284 | 14.5293 | 16.5736 | 19.9988 |
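
The Epoch and Step columns above can be combined to estimate how many optimizer steps make up one epoch. The sketch below is a rough calculation rather than a value reported by the card: it assumes no gradient accumulation and a single device (neither is stated), and it accounts for the epoch value being rounded to two decimals. All variable names are illustrative.

```python
# Values taken from the last logged row of the training results table.
final_step = 600000
final_epoch = 2.96       # rounded to two decimals in the table
train_batch_size = 1     # from the hyperparameter list

# Assumption: one optimizer step per batch (no gradient accumulation,
# single device). The card does not state either of these.
steps_per_epoch = final_step / final_epoch
examples_per_epoch = steps_per_epoch * train_batch_size

# The true epoch lies somewhere in [2.955, 2.965), so bound the estimate.
low = final_step / 2.965
high = final_step / 2.955

print(f"steps per epoch: about {steps_per_epoch:,.0f}")
print(f"plausible range: {low:,.0f} to {high:,.0f}")
print(f"examples per epoch at batch size 1: about {examples_per_epoch:,.0f}")
```

Under these assumptions the number of steps per epoch equals the number of training examples seen per epoch, since the batch size is 1.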
+
+
+ ### Framework versions
+
+ - Transformers 4.23.0.dev0
+ - PyTorch 1.12.0
+ - Datasets 2.5.1
+ - Tokenizers 0.13.0