umarbutler committed on
Commit
01dc13a
1 Parent(s): b19ae6f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -82,7 +82,7 @@ The model was trained with the following hyperparameters for the first 100,290 s
  | Weight decay | 0.01 |
  | Warmup ratio | 0.06 |

- After training on two RTX A6000s for \~120,050 steps over a period of 91 hours, the [vast.ai](https://vast.ai) instance hosting the model crashed. Fortunately, a checkpoint had been saved at step 100,290 (~60% of an epoch), although the optimiser's state was mistakenly not downloaded. The model was subsequently moved to a new instance where it was trained on an L40 for a further 133,711 steps (~40% of an epoch) with the following hyperparameters (changes emphasised):
+ After training on two RTX A6000s for \~120,050 steps over a period of 91 hours, the [vast.ai](https://vast.ai) instance hosting the model crashed. Fortunately, a checkpoint had been saved at step 100,290 (\~60% of an epoch), although the optimiser's state was mistakenly not downloaded. The model was subsequently moved to a new instance where it was trained on an L40 for a further 133,711 steps (\~40% of an epoch) with the following hyperparameters (changes emphasised):
  | Hyperparameter | Value |
  | --- | --- |
  | Sequence length | 512 |
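
The only change in this hunk is that the two remaining bare `~` characters become `\~`. The commit message does not say why. A plausible reason, stated here as an assumption, is that some Markdown renderers treat a pair of tildes as strikethrough delimiters, which would strike out the text between `~60%` and `~40%`. A minimal sketch of the same transformation follows; the helper name `escape_bare_tildes` is hypothetical and not part of the repository:

```python
import re

# Match a tilde that is not already preceded by a backslash.
# Limitation of this sketch: it does not treat "\\~" (an escaped
# backslash followed by a bare tilde) as a special case.
_BARE_TILDE = re.compile(r"(?<!\\)~")


def escape_bare_tildes(markdown_line: str) -> str:
    """Return the line with every unescaped `~` replaced by `\\~`."""
    # In a replacement template, "\\\\" yields one literal backslash.
    return _BARE_TILDE.sub(r"\\~", markdown_line)


# Shortened stand-in for the README sentence touched by the commit.
before = r"for \~120,050 steps (~60% of an epoch) and (~40% of an epoch)"
after = escape_bare_tildes(before)
print(after)
```

Tildes that are already escaped, such as `\~120,050`, are left untouched, so running the helper a second time does not change the line.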