sam-mosaic committed
Commit 01548f3 • 1 Parent(s): e7119f3
add training configuration section
README.md CHANGED

@@ -152,6 +152,11 @@ For more details on the pretraining process, see [MPT-7B](https://huggingface.co
 
 The data was tokenized using the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
 
+### Training Configuration
+
+This model was trained on 8 A100-40GBs for about 2.3 hours using the [MosaicML Platform](https://www.mosaicml.com/platform).
+The model was trained with sharded data parallelism using [FSDP](https://pytorch.org/docs/stable/fsdp.html) and used the AdamW optimizer.
+
 ## Limitations and Biases
 
 _The following language is modified from [EleutherAI's GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)_
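The context lines in the diff note that the pretraining data was tokenized with the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer. A minimal sketch of loading that tokenizer via the Hugging Face `transformers` library (only the tokenizer name comes from the diff; the sample text and everything else here is illustrative):

```python
from transformers import AutoTokenizer

# Load the GPT-NeoX-20B tokenizer referenced in the README context lines.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Tokenize a sample string and inspect the resulting token IDs.
sample = "MosaicML makes ML training efficient."
token_ids = tokenizer(sample)["input_ids"]
print(len(token_ids), token_ids[:10])
```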
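The added Training Configuration section states that the model was trained with sharded data parallelism via PyTorch FSDP and the AdamW optimizer. A minimal, hedged sketch of that combination is below; the model, hyperparameters, and process-group setup are placeholders, not MosaicML's actual training code:

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Placeholder setup: assumes launch via torchrun, one process per GPU.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Placeholder model standing in for the actual transformer.
model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).cuda()

# Wrap the model so parameters, gradients, and optimizer state are sharded across ranks.
model = FSDP(model)

# AdamW over the FSDP-wrapped parameters; hyperparameters are illustrative only.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.01)

# One illustrative training step on dummy data.
batch = torch.randn(8, 1024, device="cuda")
loss = model(batch).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()

dist.destroy_process_group()
```

A script like this would typically be launched with `torchrun --nproc_per_node=8 train.py`, matching the 8-GPU setup described in the diff.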