jacobfulano committed on
Commit c8eb665
1 Parent(s): 65996c1

Update README.md

Files changed (1):
1. README.md +13 -14

README.md CHANGED
@@ -18,6 +18,19 @@ March 2023
* Blog post
* Github (mosaicml/examples repo)

+ # How to use
+
+ ```python
+ from transformers import AutoModelForMaskedLM
+ mlm = AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base', use_auth_token=<your token>, trust_remote_code=True)
+ ```
+ The tokenizer for this model is the Hugging Face `bert-base-uncased` tokenizer.
+
+ ```python
+ from transformers import BertTokenizer
+ tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+ ```
+
## Model description

In order to build MosaicBERT, we adopted architectural choices from the recent transformer literature.
@@ -60,22 +73,8 @@ reduces the number of read/write operations between the GPU HBM (high bandwidth
MosaicBERT-Base trains faster than BERT-Base despite having more parameters.


- # How to use
-
-
- ```python
- from transformers import AutoModelForMaskedLM
- mlm = AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base', use_auth_token=<your token>, trust_remote_code=True)
- ```
- The tokenizer for this model is the Hugging Face `bert-base-uncased` tokenizer.
-
- ```python
- from transformers import BertTokenizer
- tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
- ```
-
## Training data

MosaicBERT is pretrained using a standard Masked Language Modeling (MLM) objective: the model is given a sequence of
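For context on the "How to use" snippet added in this commit, here is a minimal end-to-end sketch of masked-token prediction with the model and tokenizer loaded as above. It assumes `torch` and `transformers` are installed, that the model's remote code returns a standard `MaskedLMOutput` with a `logits` field, and the example sentence and variable names are illustrative, not from the README:

```python
import torch
from transformers import AutoModelForMaskedLM, BertTokenizer

# Standard bert-base-uncased tokenizer, as stated in the README
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# MosaicBERT ships custom modeling code, so trust_remote_code=True is required;
# add use_auth_token=<your token> as in the README snippet if the repo is gated.
mlm = AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base', trust_remote_code=True)

# Fill in a masked token (illustrative sentence, not from the README)
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = mlm(**inputs).logits  # assumes remote code returns MaskedLMOutput

# Locate the [MASK] position and take the highest-scoring vocabulary id there
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```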