Update README.md
README.md CHANGED

@@ -17,9 +17,6 @@ This model is a pre-trained BERT-Base trained in two phases on the [Graphcore/wi
 
 Pre-trained BERT Base model trained on Wikipedia data.
 
-## Intended uses & limitations
-
-More information needed
 
 ## Training and evaluation data
 
@@ -31,7 +28,7 @@ Trained on wikipedia datasets:
 ## Training procedure
 
 Trained MLM and NSP pre-training scheme from [Large Batch Optimization for Deep Learning: Training BERT in 76 minutes](https://arxiv.org/abs/1904.00962).
-Trained on 16 Graphcore Mk2 IPUs.
+Trained on 16 Graphcore Mk2 IPUs using [`optimum-graphcore`](https://github.com/huggingface/optimum-graphcore)
 
 Command lines:
 
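The card cites the standard BERT MLM pre-training objective. As a rough illustration of the usual masking policy (select ~15% of tokens; of those, 80% become `[MASK]`, 10% a random token, 10% left unchanged), here is a minimal generic sketch in plain Python — not Graphcore's actual data pipeline, and the function name and vocabulary are made up for the example:

```python
import random

MASK_TOKEN = "[MASK]"

def mask_for_mlm(tokens, vocab, mask_prob=0.15, rng=None):
    """BERT-style MLM masking sketch.

    Selects roughly `mask_prob` of positions as prediction targets;
    of those, 80% are replaced with [MASK], 10% with a random vocab
    token, and 10% are kept unchanged (but still predicted).
    Returns (masked_tokens, labels): labels[i] holds the original
    token at selected positions and None elsewhere.
    """
    rng = rng or random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # this position becomes a prediction target
            r = rng.random()
            if r < 0.8:
                masked[i] = MASK_TOKEN        # 80%: replace with [MASK]
            elif r < 0.9:
                masked[i] = rng.choice(vocab)  # 10%: replace with random token
            # else (10%): keep the original token
    return masked, labels

# Example: mask a toy token sequence with a fixed seed for reproducibility.
tokens = ["the", "cat", "sat", "on", "the", "mat"] * 10
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
masked, labels = mask_for_mlm(tokens, vocab, rng=random.Random(0))
```

The loss is then computed only at positions where `labels` is not `None`, which is why the 10% "kept unchanged" case still contributes training signal.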