Commit e685f90, committed by UNIST-Eunchan
Parent(s): 57c555c

Update README.md

README.md CHANGED
@@ -69,12 +69,26 @@ More information needed
 Paper Summarization

 ## Compare to Baseline
--
  - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
-- **Our Model (Generated with length_penalty=1, num_beams=2, max_length=128*4,min_length=150, no_repeat_ngram_size= 3, top_k=25,top_p=0.95**)
- - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399
-


 ## Training and evaluation data

@@ -88,8 +102,7 @@ We use huggingface-based environment such as datasets, trainer, etc.
 ### Training hyperparameters

 The following hyperparameters were used during training:
-
-- train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 64
@@ -97,7 +110,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1586
-- num_epochs: 5


 ### Framework versions
@@ -69,12 +69,26 @@ More information needed
 Paper Summarization

 ## Compare to Baseline
+- Pegasus-X-base **zero-shot** Performance:
  - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591

+- **This model**
+
+
+ - R-1 | R-2 | R-L | R-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399 at
+```(python)
+model.generate(input_ids =inputs["input_ids"].to(device),
+attention_mask=inputs["attention_mask"].to(device),
+length_penalty=1, num_beams=2, max_length=128*4,min_length=150, no_repeat_ngram_size= 3, top_k=25,top_p=0.95)
+
+```
+ - R-1 | R-2 | R-L | R-LSUM : 40.8486 | 16.3717 | 25.2937 | 33.6923 at
+```(python)
+model.generate(input_ids =inputs["input_ids"].to(device),
+attention_mask=inputs["attention_mask"].to(device),
+length_penalty=1, num_beams=1, max_length=128*2,top_p=1)
+```
+

 ## Training and evaluation data

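Review note: the two decoding configurations added in this hunk can be kept side by side, as a minimal sketch. The names `BEAM_CONFIG`, `GREEDY_CONFIG`, and `summarize` are hypothetical and do not appear in the README; `model`, `inputs`, and `device` are assumed to be prepared as in the README snippet. As far as I know, `top_k` and `top_p` only affect Hugging Face `generate` when `do_sample=True`, which neither call sets, so they are kept here only to mirror the README.

```python
# Decoding settings copied from the README diff; the dict names are hypothetical.
BEAM_CONFIG = {
    "length_penalty": 1,
    "num_beams": 2,
    "max_length": 128 * 4,
    "min_length": 150,
    "no_repeat_ngram_size": 3,
    # Assumption: ignored unless do_sample=True is also passed.
    "top_k": 25,
    "top_p": 0.95,
}

GREEDY_CONFIG = {
    "length_penalty": 1,
    "num_beams": 1,
    "max_length": 128 * 2,
    # Assumption: ignored unless do_sample=True is also passed.
    "top_p": 1,
}


def summarize(model, inputs, device, config):
    """Call model.generate as the README does, using one of the configs above."""
    return model.generate(
        input_ids=inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
        **config,
    )
```

With a loaded model and tokenized inputs, `summarize(model, inputs, device, BEAM_CONFIG)` would correspond to the first reported ROUGE row and `GREEDY_CONFIG` to the second.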
@@ -88,8 +102,7 @@ We use huggingface-based environment such as datasets, trainer, etc.
 ### Training hyperparameters

 The following hyperparameters were used during training:
+```learning_rate: 1e-05,train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 64
@@ -97,7 +110,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1586
+- num_epochs: 5```


 ### Framework versions
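Review note: the training hyperparameters in this diff imply an effective batch size and a learning-rate curve that can be sketched in a few lines. This is an illustration only. It assumes a single device and a linear schedule that warms up to the base rate and then decays to zero; the total number of optimizer steps is not given in the diff, so `total_steps` is a hypothetical argument. The function names are also hypothetical.

```python
# Values taken from the README diff.
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 64
LEARNING_RATE = 1e-05
WARMUP_STEPS = 1586


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples per optimizer step; num_devices=1 is an assumption."""
    return per_device * accum * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)
```

For example, `effective_batch_size()` gives 64, and `linear_lr(1586, total_steps)` returns the peak rate of 1e-05 for any `total_steps` larger than the warmup length.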