RatnaKumar
committed on
Commit 6d41c99
1 Parent(s): 1c4513b
Changed distillation URL
README.md
CHANGED
@@ -187,7 +187,7 @@ The details of the masking procedure for each sentence are the following:
 ### Pretraining

 The model was trained on 8 16 GB V100 for 90 hours. See the
-[training code](https://github.com/huggingface/transformers/tree/
+[training code](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) for all hyperparameters
 details.

 ## Evaluation results