Commit 638b325 by farleyknight
1 Parent(s): d4f9824
update model card README.md
README.md ADDED
@@ -0,0 +1,137 @@
+---
+license: apache-2.0
+tags:
+- generated_from_trainer
+datasets:
+- arxiv-summarization
+metrics:
+- rouge
+model-index:
+- name: arxiv-summarization-fb-bart-base-2022-09-21
+  results:
+  - task:
+      name: Sequence-to-sequence Language Modeling
+      type: text2text-generation
+    dataset:
+      name: arxiv-summarization
+      type: arxiv-summarization
+      config: section
+      split: train
+      args: section
+    metrics:
+    - name: Rouge1
+      type: rouge
+      value: 18.0642
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# arxiv-summarization-fb-bart-base-2022-09-21
+
+This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the arxiv-summarization dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.1611
+- Rouge1: 18.0642
+- Rouge2: 7.8284
+- Rougel: 14.5293
+- Rougelsum: 16.5736
+- Gen Len: 19.9988
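
For readers unfamiliar with the metric: ROUGE-1 is commonly reported as a unigram-overlap F-measure, and values such as 18.0642 suggest a 0 to 100 scale. The sketch below is a simplified illustration only. It assumes lowercase whitespace tokenization and no stemming, so it will not reproduce the scores in this card, and `rouge1_f` is a hypothetical helper rather than the function used to evaluate this model.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative strings, not taken from the dataset.
score = rouge1_f(
    "we study sparse attention",
    "we study sparse attention for long documents",
)
print(round(100 * score, 2))  # 72.73
```

In this toy case precision is 1.0 but recall is only 4/7, because the prediction is much shorter than the reference. A summary capped near 20 tokens, as the Gen Len figure suggests, faces the same recall limit against a long reference abstract.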
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 3.0
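
As a rough illustration of `lr_scheduler_type: linear`, the sketch below decays the learning rate linearly from 5e-05 to zero. It assumes no warmup, since the card lists none, and it estimates the total step count from the last logged row of the training results (step 600000 at epoch 2.96). Both are assumptions rather than values stated in the card, and `linear_lr` is a hypothetical helper, not a Transformers API.

```python
BASE_LR = 5e-05  # learning_rate from the hyperparameter list


def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Linear decay from base_lr at step 0 to 0.0 at total_steps (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Assumption: extrapolate the total from step 600000 at epoch 2.96 to 3.0 epochs.
total_steps = round(600000 / 2.96 * 3.0)

for step in (0, 300000, 600000):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate at step 600000 is already close to zero, which would be consistent with the small changes in validation loss over the final rows of the results table.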
+
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
+|:-------------:|:-----:|:------:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
+| 2.9142 | 0.05 | 10000 | 2.7522 | 17.073 | 6.7502 | 13.6779 | 15.6668 | 20.0 |
+| 2.7876 | 0.1 | 20000 | 2.6888 | 16.7954 | 6.7038 | 13.4939 | 15.3426 | 19.9992 |
+| 2.715 | 0.15 | 30000 | 2.6308 | 17.3324 | 6.8771 | 13.7918 | 15.7839 | 20.0 |
+| 2.6431 | 0.2 | 40000 | 2.5858 | 16.7055 | 6.8108 | 13.4796 | 15.2959 | 20.0 |
+| 2.6381 | 0.25 | 50000 | 2.5393 | 17.4643 | 7.0687 | 13.9507 | 16.012 | 20.0 |
+| 2.6269 | 0.3 | 60000 | 2.5159 | 17.5934 | 7.0022 | 13.9394 | 16.0203 | 20.0 |
+| 2.5482 | 0.34 | 70000 | 2.4894 | 17.5428 | 7.1822 | 13.9788 | 16.0355 | 20.0 |
+| 2.4962 | 0.39 | 80000 | 2.4476 | 17.3587 | 7.1501 | 13.9215 | 15.8637 | 20.0 |
+| 2.513 | 0.44 | 90000 | 2.4309 | 18.0806 | 7.5429 | 14.4201 | 16.561 | 20.0 |
+| 2.4464 | 0.49 | 100000 | 2.4128 | 17.9813 | 7.5454 | 14.3403 | 16.52 | 19.9989 |
+| 2.4969 | 0.54 | 110000 | 2.4114 | 17.3353 | 7.1382 | 13.9109 | 15.873 | 20.0 |
+| 2.4417 | 0.59 | 120000 | 2.3866 | 18.0241 | 7.553 | 14.3892 | 16.5077 | 19.9980 |
+| 2.4333 | 0.64 | 130000 | 2.3903 | 18.0578 | 7.4999 | 14.3901 | 16.5134 | 20.0 |
+| 2.4296 | 0.69 | 140000 | 2.3793 | 17.7742 | 7.5182 | 14.2794 | 16.2879 | 20.0 |
+| 2.4277 | 0.74 | 150000 | 2.3571 | 17.8015 | 7.4677 | 14.226 | 16.3288 | 20.0 |
+| 2.4258 | 0.79 | 160000 | 2.3539 | 17.5335 | 7.399 | 14.09 | 16.0936 | 20.0 |
+| 2.4006 | 0.84 | 170000 | 2.3469 | 17.5983 | 7.4285 | 14.1315 | 16.1385 | 20.0 |
+| 2.367 | 0.89 | 180000 | 2.3344 | 17.297 | 7.2361 | 13.9286 | 15.8352 | 20.0 |
+| 2.373 | 0.94 | 190000 | 2.3377 | 17.7189 | 7.4993 | 14.2603 | 16.2546 | 19.9980 |
+| 2.3762 | 0.99 | 200000 | 2.3106 | 17.7883 | 7.4766 | 14.2675 | 16.3115 | 20.0 |
+| 2.2538 | 1.03 | 210000 | 2.3197 | 17.4487 | 7.4171 | 14.0473 | 15.9771 | 20.0 |
+| 2.268 | 1.08 | 220000 | 2.3044 | 17.9603 | 7.5806 | 14.3755 | 16.4328 | 20.0 |
+| 2.2986 | 1.13 | 230000 | 2.3002 | 17.9268 | 7.5321 | 14.3503 | 16.4191 | 20.0 |
+| 2.241 | 1.18 | 240000 | 2.3059 | 17.4542 | 7.3224 | 14.0578 | 16.0157 | 20.0 |
+| 2.2534 | 1.23 | 250000 | 2.2927 | 17.8039 | 7.6232 | 14.2916 | 16.3442 | 20.0 |
+| 2.26 | 1.28 | 260000 | 2.2910 | 17.8607 | 7.5645 | 14.318 | 16.3336 | 19.9983 |
+| 2.3 | 1.33 | 270000 | 2.2818 | 17.8203 | 7.4815 | 14.3171 | 16.3309 | 20.0 |
+| 2.2964 | 1.38 | 280000 | 2.2721 | 17.983 | 7.6867 | 14.3971 | 16.493 | 20.0 |
+| 2.2564 | 1.43 | 290000 | 2.2701 | 18.059 | 7.7273 | 14.4806 | 16.5792 | 19.9988 |
+| 2.2576 | 1.48 | 300000 | 2.2663 | 17.5706 | 7.4424 | 14.1424 | 16.1297 | 20.0 |
+| 2.2605 | 1.53 | 310000 | 2.2607 | 17.8057 | 7.5219 | 14.3226 | 16.3355 | 19.9988 |
+| 2.2587 | 1.58 | 320000 | 2.2552 | 18.0396 | 7.7064 | 14.5005 | 16.5823 | 20.0 |
+| 2.2423 | 1.63 | 330000 | 2.2523 | 18.2229 | 7.8398 | 14.5868 | 16.7408 | 20.0 |
+| 2.2793 | 1.68 | 340000 | 2.2431 | 17.6785 | 7.5437 | 14.1971 | 16.1724 | 19.9988 |
+| 2.2005 | 1.72 | 350000 | 2.2343 | 17.7552 | 7.6026 | 14.2152 | 16.2797 | 19.9988 |
+| 2.2454 | 1.77 | 360000 | 2.2339 | 17.9292 | 7.699 | 14.4099 | 16.4682 | 20.0 |
+| 2.2175 | 1.82 | 370000 | 2.2345 | 17.7413 | 7.4892 | 14.2223 | 16.2442 | 20.0 |
+| 2.238 | 1.87 | 380000 | 2.2259 | 17.6679 | 7.4976 | 14.24 | 16.243 | 19.9988 |
+| 2.2108 | 1.92 | 390000 | 2.2210 | 17.8474 | 7.6054 | 14.3494 | 16.3635 | 19.9988 |
+| 2.2124 | 1.97 | 400000 | 2.2170 | 17.8019 | 7.5182 | 14.264 | 16.3003 | 20.0 |
+| 2.0976 | 2.02 | 410000 | 2.2248 | 17.8063 | 7.5383 | 14.2782 | 16.275 | 20.0 |
+| 2.0932 | 2.07 | 420000 | 2.2196 | 17.9171 | 7.6187 | 14.3508 | 16.4333 | 20.0 |
+| 2.0956 | 2.12 | 430000 | 2.2135 | 18.0616 | 7.7655 | 14.4837 | 16.5627 | 19.9988 |
+| 2.0515 | 2.17 | 440000 | 2.2091 | 18.0281 | 7.7301 | 14.4696 | 16.5196 | 19.9981 |
+| 2.1216 | 2.22 | 450000 | 2.2015 | 18.0609 | 7.7541 | 14.4633 | 16.5705 | 19.9988 |
+| 2.1222 | 2.27 | 460000 | 2.1983 | 18.0717 | 7.7473 | 14.4725 | 16.5399 | 19.9988 |
+| 2.0903 | 2.32 | 470000 | 2.2007 | 18.0751 | 7.7486 | 14.4583 | 16.546 | 20.0 |
+| 2.1124 | 2.37 | 480000 | 2.1934 | 17.888 | 7.7124 | 14.3899 | 16.3901 | 20.0 |
+| 2.1094 | 2.41 | 490000 | 2.1901 | 18.0254 | 7.7682 | 14.4427 | 16.5181 | 20.0 |
+| 2.1085 | 2.46 | 500000 | 2.1924 | 17.9077 | 7.7004 | 14.3843 | 16.4221 | 19.9988 |
+| 2.0781 | 2.51 | 510000 | 2.1781 | 18.1591 | 7.8456 | 14.565 | 16.6435 | 19.9988 |
+| 2.0875 | 2.56 | 520000 | 2.1801 | 18.0389 | 7.7342 | 14.4259 | 16.5378 | 20.0 |
+| 2.0945 | 2.61 | 530000 | 2.1758 | 18.0999 | 7.8217 | 14.5163 | 16.5784 | 19.9988 |
+| 2.0723 | 2.66 | 540000 | 2.1756 | 17.9684 | 7.7369 | 14.4279 | 16.4815 | 19.9988 |
+| 2.0918 | 2.71 | 550000 | 2.1738 | 18.1183 | 7.8414 | 14.5298 | 16.6119 | 19.9988 |
+| 2.0835 | 2.76 | 560000 | 2.1671 | 17.8837 | 7.7379 | 14.3727 | 16.4068 | 19.9988 |
+| 2.0936 | 2.81 | 570000 | 2.1670 | 17.9631 | 7.7708 | 14.4566 | 16.4823 | 19.9988 |
+| 2.0518 | 2.86 | 580000 | 2.1631 | 18.0601 | 7.8112 | 14.5158 | 16.5816 | 19.9988 |
+| 2.065 | 2.91 | 590000 | 2.1611 | 18.0548 | 7.8147 | 14.5271 | 16.5606 | 19.9988 |
+| 2.0427 | 2.96 | 600000 | 2.1611 | 18.0642 | 7.8284 | 14.5293 | 16.5736 | 19.9988 |
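
Rouge1 does not improve monotonically in the table above, so the final row is not necessarily the best one by that metric. The sketch below copies three rows from the table (steps 330000, 510000 and 600000) and compares them by Rouge1 and by validation loss. The tuple layout and variable names are illustrative assumptions, not part of the training script.

```python
# (step, validation_loss, rouge1) copied from three rows of the table above
rows = [
    (330000, 2.2523, 18.2229),
    (510000, 2.1781, 18.1591),
    (600000, 2.1611, 18.0642),
]

best_by_rouge1 = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])

print("highest Rouge1:", best_by_rouge1)        # (330000, 2.2523, 18.2229)
print("lowest validation loss:", best_by_loss)  # (600000, 2.1611, 18.0642)
```

Across the full table the highest Rouge1 is also the 18.2229 logged at step 330000, while the lowest validation loss (2.1611) is reached at steps 590000 and 600000. The two criteria therefore point to different checkpoints.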
+
+
+### Framework versions
+
+- Transformers 4.23.0.dev0
+- Pytorch 1.12.0
+- Datasets 2.5.1
+- Tokenizers 0.13.0