UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv

text2text-generation

Generated from Trainer


UNIST-Eunchan committed on Nov 10, 2023

Commit e685f90 • 1 Parent(s): 57c555c

Update README.md

Files changed (1)

README.md +20 -7

README.md CHANGED

@@ -69,12 +69,26 @@ More information needed
 Paper Summarization
 ## Compare to Baseline
-- **Pegasus-X-base Zero-shot Performance:**
   - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
-- **Our Model (Generated with length_penalty=1, num_beams=2, max_length=128*4,min_length=150, no_repeat_ngram_size= 3, top_k=25,top_p=0.95**)
-  - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399
 ## Training and evaluation data
@@ -88,8 +102,7 @@ We use huggingface-based environment such as datasets, trainer, etc.
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 64
@@ -97,7 +110,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1586
-- num_epochs: 5
 ### Framework versions

 Paper Summarization
 ## Compare to Baseline
+- Pegasus-X-base **zero-shot** Performance:
   - ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
+- **This model**
+  - R-1 | R-2 | R-L | R-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399 at
+  ```(python)
+  model.generate(input_ids =inputs["input_ids"].to(device),
+                              attention_mask=inputs["attention_mask"].to(device),
+                              length_penalty=1, num_beams=2, max_length=128*4,min_length=150, no_repeat_ngram_size= 3, top_k=25,top_p=0.95)
+  ```
+  - R-1 | R-2 | R-L | R-LSUM : 40.8486 | 16.3717 | 25.2937 | 33.6923 at
+  ```(python)
+  model.generate(input_ids =inputs["input_ids"].to(device),
+                              attention_mask=inputs["attention_mask"].to(device),
+                              length_penalty=1, num_beams=1, max_length=128*2,top_p=1)
+  ```
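The two `model.generate` calls above are fragments: `model`, `inputs`, and `device` are defined elsewhere. Below is a minimal sketch of how they might be wrapped so both decoding settings can be compared on the same input. The names `BEAM_CONFIG`, `GREEDY_CONFIG`, `summarize`, and `load`, the `truncation=True` tokenizer argument, and the use of the `Auto*` classes are assumptions for illustration, not part of the README; only the keyword values are copied from the calls above. Note also that in Transformers `top_k` and `top_p` only take effect when sampling is enabled (`do_sample=True`), which neither call sets, so unless the checkpoint's own generation config turns sampling on, the first setting is probably plain beam search with 2 beams and the second plain greedy search.

```python
# Decoding settings copied from the two generate() calls in the README.
BEAM_CONFIG = dict(
    length_penalty=1, num_beams=2, max_length=128 * 4, min_length=150,
    no_repeat_ngram_size=3, top_k=25, top_p=0.95,
)
GREEDY_CONFIG = dict(length_penalty=1, num_beams=1, max_length=128 * 2, top_p=1)


def summarize(text, model, tokenizer, device="cpu", **gen_kwargs):
    """Hypothetical helper: tokenize one document, generate, decode one summary."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(
        input_ids=inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
        **gen_kwargs,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def load(model_id="UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv",
         device="cpu"):
    """Assumed loading path via the Auto* classes; imported lazily so the
    settings above can be inspected without transformers installed."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to(device)
    return model, tokenizer
```

With a loaded model, `summarize(paper_text, model, tokenizer, **BEAM_CONFIG)` would correspond to the first reported ROUGE row and `**GREEDY_CONFIG` to the second, assuming the same evaluation inputs.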
 ## Training and evaluation data
 ### Training hyperparameters
 The following hyperparameters were used during training:
+```learning_rate: 1e-05,train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1586
+- num_epochs: 5```
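The hyperparameters listed above imply an effective batch size and a learning-rate curve that the list does not state directly. The following is a small sketch of that arithmetic, assuming a single device and the usual linear warmup followed by linear decay to zero. The function names and the `total_steps` value used in the example are hypothetical: the total number of optimizer steps depends on the dataset size, which this section does not give.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 64
WARMUP_STEPS = 1586


def effective_batch_size(per_device_batch, accum_steps, num_devices=1):
    """Examples contributing to one optimizer update (num_devices=1 is an assumption)."""
    return per_device_batch * accum_steps * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    # total_steps=10_000 is a placeholder, not a value from the README.
    for step in (0, 793, 1586, 5000, 10_000):
        print(step, linear_lr(step, total_steps=10_000))
```

With `train_batch_size: 1` and `gradient_accumulation_steps: 64`, each optimizer update averages gradients over 64 examples, which is why a per-device batch of 1 remains workable for long inputs.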
 ### Framework versions