zwellington committed
Commit: aa158a6
1 Parent(s): 8ec094b

update model card README.md

Files changed (1):
1. README.md (+118, -0)
README.md ADDED

---
license: mit
base_model: facebook/bart-large-cnn
tags:
- generated_from_trainer
datasets:
- clupubhealth
metrics:
- rouge
model-index:
- name: bart-cnn-pubhealth-expanded
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: clupubhealth
      type: clupubhealth
      config: expanded
      split: test
      args: expanded
    metrics:
    - name: Rouge1
      type: rouge
      value: 28.3745
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# bart-cnn-pubhealth-expanded

This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the clupubhealth dataset.
It achieves the following results on the evaluation set:
- Loss: 2.7286
- Rouge1: 28.3745
- Rouge2: 8.806
- Rougel: 19.3896
- Rougelsum: 20.7149
- Gen Len: 66.075

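The snippet below is a minimal sketch of loading this checkpoint for summarization with the Transformers `pipeline` API. The repo id `zwellington/bart-cnn-pubhealth-expanded` is inferred from the committer and model name above and may differ, and the input text is a placeholder rather than clupubhealth data.

```python
from transformers import pipeline

# Repo id inferred from the committer and model name above; adjust if the
# checkpoint is hosted under a different namespace.
summarizer = pipeline(
    "summarization",
    model="zwellington/bart-cnn-pubhealth-expanded",
)

# Placeholder input; replace with the public-health text to be summarized.
article = (
    "Officials reported that the new screening programme reduced late-stage "
    "diagnoses in the pilot regions, although independent reviews of the "
    "underlying data are still pending."
)

# Generation lengths chosen to roughly match the ~66-token summaries reported above.
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```
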
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (an equivalent configuration is sketched after this list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10

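As a hedged sketch, the values above map onto `Seq2SeqTrainingArguments` in Transformers 4.31 (listed under Framework versions below) roughly as follows. The `output_dir` is an illustrative placeholder, and the 500-step evaluation cadence is inferred from the results table rather than recorded in this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the hyperparameters listed above.
# output_dir is a placeholder; evaluation_strategy/eval_steps are inferred
# from the 500-step rows in the results table, not recorded explicitly here.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-pubhealth-expanded",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size of 32
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",
    eval_steps=500,
    predict_with_generate=True,  # required to compute ROUGE and Gen Len during eval
)
```
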
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.571 | 0.26 | 500 | 2.2030 | 29.8543 | 10.1926 | 20.7137 | 21.7285 | 66.6 |
| 2.313 | 0.51 | 1000 | 2.1891 | 29.5708 | 9.5292 | 20.0823 | 21.4907 | 66.87 |
| 2.1371 | 0.77 | 1500 | 2.1981 | 29.7651 | 9.4575 | 20.412 | 21.2983 | 65.925 |
| 1.9488 | 1.03 | 2000 | 2.3023 | 29.6158 | 9.4241 | 20.6193 | 21.5966 | 64.745 |
| 1.7406 | 1.29 | 2500 | 2.2808 | 30.0862 | 9.8179 | 20.5477 | 21.4372 | 65.17 |
| 1.6732 | 1.54 | 3000 | 2.2953 | 29.65 | 9.693 | 20.3996 | 21.1837 | 64.48 |
| 1.6349 | 1.8 | 3500 | 2.3093 | 29.9081 | 9.4101 | 20.2955 | 21.381 | 64.605 |
| 1.4981 | 2.06 | 4000 | 2.3376 | 29.3183 | 9.2161 | 20.4919 | 21.3562 | 64.73 |
| 1.3951 | 2.32 | 4500 | 2.3323 | 29.9405 | 9.118 | 19.9364 | 21.1458 | 66.425 |
| 1.3775 | 2.57 | 5000 | 2.3597 | 29.1785 | 8.7657 | 19.6031 | 20.6261 | 65.505 |
| 1.3426 | 2.83 | 5500 | 2.3744 | 29.1015 | 8.9953 | 20.0223 | 21.1623 | 64.99 |
| 1.2243 | 3.09 | 6000 | 2.4723 | 28.8329 | 8.8603 | 19.9412 | 21.0484 | 65.655 |
| 1.1798 | 3.35 | 6500 | 2.4063 | 28.9035 | 8.9915 | 19.8531 | 20.9957 | 65.93 |
| 1.1926 | 3.6 | 7000 | 2.4110 | 29.4024 | 8.8828 | 19.4321 | 20.763 | 65.9 |
| 1.1791 | 3.86 | 7500 | 2.4147 | 29.8599 | 9.168 | 20.2613 | 21.4986 | 65.205 |
| 1.0545 | 4.12 | 8000 | 2.4941 | 27.9696 | 8.1513 | 19.5133 | 20.2316 | 65.26 |
| 1.0513 | 4.37 | 8500 | 2.4345 | 28.8695 | 8.7627 | 19.8116 | 20.8412 | 64.375 |
| 1.0516 | 4.63 | 9000 | 2.4550 | 29.3524 | 9.1717 | 20.0134 | 21.1516 | 65.59 |
| 1.0454 | 4.89 | 9500 | 2.4543 | 29.0709 | 8.8377 | 19.9499 | 20.9215 | 66.055 |
| 0.9247 | 5.15 | 10000 | 2.5152 | 28.8769 | 8.7619 | 19.5535 | 20.5383 | 65.455 |
| 0.9529 | 5.4 | 10500 | 2.5192 | 29.4734 | 8.6629 | 19.6803 | 20.9521 | 66.855 |
| 0.953 | 5.66 | 11000 | 2.5530 | 28.7234 | 8.5991 | 19.235 | 20.3965 | 64.62 |
| 0.9519 | 5.92 | 11500 | 2.5024 | 28.8013 | 8.8198 | 19.091 | 20.2732 | 65.16 |
| 0.8492 | 6.18 | 12000 | 2.6300 | 28.8821 | 8.974 | 20.1383 | 21.1273 | 66.16 |
| 0.8705 | 6.43 | 12500 | 2.6192 | 28.9942 | 9.0923 | 20.0151 | 20.9462 | 66.17 |
| 0.8489 | 6.69 | 13000 | 2.5758 | 28.5162 | 8.7087 | 19.6472 | 20.6057 | 68.725 |
| 0.8853 | 6.95 | 13500 | 2.5783 | 29.0936 | 8.8353 | 19.8755 | 20.867 | 65.61 |
| 0.8043 | 7.21 | 14000 | 2.6668 | 28.198 | 8.5221 | 19.2404 | 20.4359 | 66.84 |
| 0.8004 | 7.46 | 14500 | 2.6676 | 28.4951 | 8.8535 | 19.8777 | 20.8867 | 65.99 |
| 0.8067 | 7.72 | 15000 | 2.6136 | 29.2442 | 8.8243 | 19.7428 | 20.9531 | 66.265 |
| 0.8008 | 7.98 | 15500 | 2.6362 | 28.9875 | 8.8529 | 19.6993 | 20.6463 | 65.83 |
| 0.7499 | 8.23 | 16000 | 2.6987 | 29.2742 | 9.0804 | 19.8464 | 21.0735 | 65.66 |
| 0.7556 | 8.49 | 16500 | 2.6859 | 28.5046 | 8.3465 | 19.0813 | 20.2561 | 65.31 |
| 0.7574 | 8.75 | 17000 | 2.7021 | 29.2861 | 8.8262 | 19.5899 | 20.9786 | 65.735 |
| 0.7524 | 9.01 | 17500 | 2.7160 | 29.1471 | 8.9296 | 20.0009 | 21.2013 | 66.415 |
| 0.7124 | 9.26 | 18000 | 2.7418 | 28.8323 | 8.7672 | 19.5686 | 20.5814 | 67.355 |
| 0.7084 | 9.52 | 18500 | 2.7267 | 28.3833 | 8.7165 | 19.0514 | 20.3386 | 67.075 |
| 0.7251 | 9.78 | 19000 | 2.7286 | 28.3745 | 8.806 | 19.3896 | 20.7149 | 66.075 |

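The Rouge columns above follow the `rouge` metric declared in the card metadata, reported scaled by 100. Below is a small sketch of how such scores are typically computed with the `evaluate` library; the prediction/reference strings are made up for illustration, not drawn from clupubhealth.

```python
import evaluate

# Made-up prediction/reference pair purely for illustration; real scoring
# would use model outputs and gold summaries from the clupubhealth test split.
predictions = ["the review found only partial support for the health claim"]
references = ["reviewers concluded the health claim is only partially supported"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)

# evaluate returns fractions in [0, 1]; the table above reports values scaled by 100.
print({name: round(100 * value, 4) for name, value in scores.items()})
```
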

### Framework versions

- Transformers 4.31.0
- PyTorch 2.0.1+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2