UNIST-Eunchan committed
Commit • 333f437
1 Parent(s): 510a7fc

Update README.md

README.md CHANGED
@@ -92,7 +92,7 @@ Paper Summarization
 ```(python)
 model.generate(input_ids =inputs["input_ids"].to(device),
 attention_mask=inputs["attention_mask"].to(device),
-num_beam_groups=
+num_beam_groups=5,diversity_penalty=1.0,num_beams=5,min_length=150,max_length=128*4)
 ```
 
 
@@ -109,15 +109,19 @@ We use huggingface-based environment such as datasets, trainer, etc.
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-
-
-
-
-
-
-
-
-
+
+```(python)
+learning_rate: 1e-05,
+train_batch_size: 1,
+eval_batch_size: 1,
+seed: 42,
+gradient_accumulation_steps: 64,
+total_train_batch_size: 64,
+optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08,
+lr_scheduler_type: linear,
+lr_scheduler_warmup_steps: 1586,
+num_epochs: 5
+```
 
 
 ### Framework versions
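Review note: the values added in this commit can be sanity-checked with plain arithmetic. The sketch below is not part of the README and does not call `transformers`. The helper names (`check_generate_kwargs`, `effective_batch_size`, `linear_warmup_lr`) are hypothetical. The constraints it encodes (that `num_beams` must be divisible by `num_beam_groups`, and that group beam search expects a positive `diversity_penalty`) reflect my understanding of Hugging Face `generate` and should be confirmed against the installed version. The commit does not state the total number of training steps or the device count, so `total_steps` and `num_devices` are assumptions.

```python
# Values copied from the commit; everything else here is illustrative.
GENERATE_KWARGS = {
    "num_beam_groups": 5,
    "diversity_penalty": 1.0,
    "num_beams": 5,
    "min_length": 150,
    "max_length": 128 * 4,  # 512 tokens
}

HYPERPARAMS = {
    "learning_rate": 1e-05,
    "train_batch_size": 1,
    "gradient_accumulation_steps": 64,
    "total_train_batch_size": 64,
    "lr_scheduler_warmup_steps": 1586,
}


def check_generate_kwargs(kwargs):
    """Return a list of problems found in group-beam-search settings."""
    problems = []
    if kwargs["num_beams"] % kwargs["num_beam_groups"] != 0:
        problems.append("num_beams must be divisible by num_beam_groups")
    if kwargs["num_beam_groups"] > 1 and kwargs["diversity_penalty"] <= 0.0:
        problems.append("diversity_penalty should be > 0 when num_beam_groups > 1")
    if kwargs["min_length"] > kwargs["max_length"]:
        problems.append("min_length exceeds max_length")
    return problems


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples contributing to one optimizer step (assumes num_devices=1)."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With these values, `num_beams` equals `num_beam_groups`, so each group holds a single beam, and `total_train_batch_size: 64` is consistent with a per-device batch of 1 accumulated over 64 steps on one device.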