joelniklaus/legal-english-longformer-base

kapllan committed on Jul 28, 2023

Commit

dae0aa3

1 Parent(s): 9a969c0

The learning rate was not displayed as it should.

Files changed (1)

README.md CHANGED

@@ -80,7 +80,7 @@ For further details see [Niklaus et al. 2023](https://arxiv.org/abs/2306.02069?u
 - batche size: 512 samples
 - Number of steps: 1M/500K for the base/large model
 - Warm-up steps for the first 5\% of the total training steps
-- Learning rate: (linearly increasing up to) $1e\!-\!4$
+- Learning rate: (linearly increasing up to) 1e-4
 - Word masking: increased 20/30\% masking rate for base/large models respectively
 ## Evaluation
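
Note: the warm-up described in the README lines above (learning rate increasing linearly to 1e-4 over the first 5% of training steps) can be sketched as follows. This is a minimal illustration, not the authors' training code. The function name `warmup_lr` and the constants are hypothetical, and holding the rate constant after warm-up is an assumption, since the README lines shown here do not state what happens afterwards.

```python
# Hypothetical constants taken from the README bullets in this diff.
PEAK_LR = 1e-4          # "linearly increasing up to 1e-4"
WARMUP_PERCENT = 5      # "first 5% of the total training steps"


def warmup_lr(step: int, total_steps: int) -> float:
    """Return the learning rate at `step` under linear warm-up.

    Assumption: after warm-up the rate stays at PEAK_LR. The README
    does not specify a decay schedule, so none is modeled here.
    """
    # Integer arithmetic avoids floating-point rounding in the step count.
    warmup_steps = total_steps * WARMUP_PERCENT // 100
    if warmup_steps > 0 and step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR


if __name__ == "__main__":
    # Base model: 1M steps, so warm-up covers the first 50,000 steps.
    for s in (0, 25_000, 50_000, 500_000):
        print(s, warmup_lr(s, 1_000_000))
```

With the step counts from the README, warm-up would span 50,000 steps for the base model (1M steps) and 25,000 steps for the large model (500K steps).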