UNIST-Eunchan committed on
Commit
e685f90
1 Parent(s): 57c555c

Update README.md

Files changed (1)
  1. README.md +20 -7
README.md CHANGED
@@ -69,12 +69,26 @@ More information needed
69
  Paper Summarization
70
 
71
  ## Compare to Baseline
72
- - **Pegasus-X-base Zero-shot Performance:**
73
  - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
74
- - **Our Model (Generated with length_penalty=1, num_beams=2, max_length=128*4,min_length=150, no_repeat_ngram_size= 3, top_k=25,top_p=0.95**)
75
- - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399
76
-
77
 
78
 
79
  ## Training and evaluation data
80
 
@@ -88,8 +102,7 @@ We use huggingface-based environment such as datasets, trainer, etc.
88
  ### Training hyperparameters
89
 
90
  The following hyperparameters were used during training:
91
- - learning_rate: 1e-05
92
- - train_batch_size: 1
93
  - eval_batch_size: 1
94
  - seed: 42
95
  - gradient_accumulation_steps: 64
@@ -97,7 +110,7 @@ The following hyperparameters were used during training:
97
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
98
  - lr_scheduler_type: linear
99
  - lr_scheduler_warmup_steps: 1586
100
- - num_epochs: 5
101
 
102
 
103
  ### Framework versions
 
69
  Paper Summarization
70
 
71
  ## Compare to Baseline
72
+ - Pegasus-X-base **zero-shot** Performance:
73
  - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
74
 
75
+ - **This model**
76
+
77
+
78
+ - R-1 | R-2 | R-L | R-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399 at
79
+ ```python
80
+ model.generate(input_ids=inputs["input_ids"].to(device),
81
+                attention_mask=inputs["attention_mask"].to(device),
82
+                length_penalty=1, num_beams=2, max_length=128*4, min_length=150, no_repeat_ngram_size=3, top_k=25, top_p=0.95)
83
+
84
+ ```
85
+ - R-1 | R-2 | R-L | R-LSUM : 40.8486 | 16.3717 | 25.2937 | 33.6923 at
86
+ ```python
87
+ model.generate(input_ids=inputs["input_ids"].to(device),
88
+                attention_mask=inputs["attention_mask"].to(device),
89
+                length_penalty=1, num_beams=1, max_length=128*2, top_p=1)
90
+ ```
91
+
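The two `generate()` calls above differ only in their decoding arguments. The sketch below collects them as plain dictionaries so they are easier to compare. It assumes `do_sample` is left at its usual default of `False`, in which case `top_k` and `top_p` are sampling-only arguments and would not influence the output. The names `BEAM_KWARGS`, `GREEDY_KWARGS`, and `inactive_sampling_args` are hypothetical and are not part of the README.

```python
# Decoding settings copied from the two generate() calls in this diff.
# Dictionary and function names are illustrative, not from the README.
BEAM_KWARGS = dict(
    length_penalty=1,
    num_beams=2,
    max_length=128 * 4,  # 512 tokens
    min_length=150,
    no_repeat_ngram_size=3,
    top_k=25,
    top_p=0.95,
)
GREEDY_KWARGS = dict(
    length_penalty=1,
    num_beams=1,
    max_length=128 * 2,  # 256 tokens
    top_p=1,
)

# Arguments that only matter when sampling is enabled.
SAMPLING_ONLY = {"top_k", "top_p", "temperature"}


def inactive_sampling_args(kwargs):
    """Return sampling-only arguments that are ignored unless do_sample=True.

    Assumption: do_sample defaults to False when it is not passed.
    """
    if kwargs.get("do_sample", False):
        return []
    return sorted(name for name in kwargs if name in SAMPLING_ONLY)


if __name__ == "__main__":
    print(inactive_sampling_args(BEAM_KWARGS))
    print(inactive_sampling_args(GREEDY_KWARGS))
```

If that assumption holds, the first set of scores comes from plain beam search with two beams and the second from greedy decoding. With a real model the dictionaries could be passed as `model.generate(**inputs, **BEAM_KWARGS)`.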
92
 
93
  ## Training and evaluation data
94
 
 
102
  ### Training hyperparameters
103
 
104
  The following hyperparameters were used during training:
105
+ ```learning_rate: 1e-05,train_batch_size: 1
 
106
  - eval_batch_size: 1
107
  - seed: 42
108
  - gradient_accumulation_steps: 64
 
110
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
111
  - lr_scheduler_type: linear
112
  - lr_scheduler_warmup_steps: 1586
113
+ - num_epochs: 5```
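As a rough illustration of how the listed hyperparameters interact, the sketch below computes the effective batch size from `train_batch_size` and `gradient_accumulation_steps`, and the learning rate under a linear schedule with 1586 warmup steps. The total number of optimizer steps is not stated in this diff, so `total_steps` is a hypothetical argument, and a single training device is assumed.

```python
# Values taken from the hyperparameter list in this diff.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 64
WARMUP_STEPS = 1586


def effective_batch_size(num_devices=1):
    """Examples per optimizer step (num_devices=1 is an assumption)."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def linear_lr(step, total_steps):
    """Learning rate at an optimizer step under linear warmup then linear decay.

    total_steps is hypothetical; the diff does not state it.
    """
    if step < WARMUP_STEPS:
        # Warmup: rise linearly from 0 to LEARNING_RATE.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Decay: fall linearly from LEARNING_RATE to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)


if __name__ == "__main__":
    print(effective_batch_size())
    for step in (0, 793, 1586, 5000, 10000):
        print(step, linear_lr(step, total_steps=10000))
```

With one device, each optimizer update aggregates gradients from 64 examples even though only one example is processed per forward pass.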
114
 
115
 
116
  ### Framework versions