Update README.md
README.md
CHANGED
@@ -62,7 +62,7 @@ Apache-2.0 (commercial use permitted)
## Documentation

-* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](www.mosaicml.com/blog/mpt-7b)
* [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
* Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!
@@ -140,14 +140,14 @@ The model was trained for 1T tokens (with batch size 1760 and sequence length 20
| Data Source | Number of Tokens in Source | Proportion | Effective Number of Tokens | Epochs |
|-------------|----------------------------|------------|----------------------------|--------|
| mC4 3.1.0 - English | 417.99 B | 0.33 | 330 B | 0.14 |
-| C4 - English - SemDedup 80% | 100.42 B | 0.
| RedPajama - CommonCrawl | 878.45 B | 0.1 | 100 B | 0.11 |
| The Stack - Selected Languages | 463.78 B | 0.1 | 100 B | 0.22 |
-| RedPajama - Wikipedia |
| The Stack - Markdown | 107.07 B | 0.035 | 35 B | 0.33 |
-| S2ORC | 48.85 B | 0.
-| RedPajama - Books | 26.02 B | 0.
-| RedPajama - arXiv | 28.10 B | 0.019 | 19 B | 0.
| RedPajama - StackExchange | 20.54 B | 0.014 | 14 B |0.68 |

Samples for each batch were selected from one of the datasets with the probability specified above.

## Documentation

+* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b)
* [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
* Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!

| Data Source | Number of Tokens in Source | Proportion | Effective Number of Tokens | Epochs |
|-------------|----------------------------|------------|----------------------------|--------|
| mC4 3.1.0 - English | 417.99 B | 0.33 | 330 B | 0.14 |
+| C4 - English - SemDedup 80% | 100.42 B | 0.294 | 294 B | 2.93 |
| RedPajama - CommonCrawl | 878.45 B | 0.1 | 100 B | 0.11 |
| The Stack - Selected Languages | 463.78 B | 0.1 | 100 B | 0.22 |
+| RedPajama - Wikipedia - En | 4.87 B | 0.04 | 40 B | 8.21 |
| The Stack - Markdown | 107.07 B | 0.035 | 35 B | 0.33 |
+| S2ORC | 48.85 B | 0.035 | 35 B | 0.72 |
+| RedPajama - Books | 26.02 B | 0.033 | 33B | 1.27 |
+| RedPajama - arXiv | 28.10 B | 0.019 | 19 B | 0.68 |
| RedPajama - StackExchange | 20.54 B | 0.014 | 14 B |0.68 |

Samples for each batch were selected from one of the datasets with the probability specified above.
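
To illustrate how the table columns relate and what per-source sampling could look like, here is a minimal Python sketch. It is not taken from the training code. The names `DATA_MIX`, `effective_tokens_b`, `epochs`, and `sample_source` are hypothetical, and two things are assumed: that the effective token count is the proportion times the 1T-token budget, and that epochs is the effective token count divided by the source size. The second assumption matches most rows of the updated table but is not stated in the README.

```python
import random

# (source name, tokens in source in billions, sampling proportion),
# copied from the updated rows of the table above.
DATA_MIX = [
    ("mC4 3.1.0 - English", 417.99, 0.33),
    ("C4 - English - SemDedup 80%", 100.42, 0.294),
    ("RedPajama - CommonCrawl", 878.45, 0.1),
    ("The Stack - Selected Languages", 463.78, 0.1),
    ("RedPajama - Wikipedia - En", 4.87, 0.04),
    ("The Stack - Markdown", 107.07, 0.035),
    ("S2ORC", 48.85, 0.035),
    ("RedPajama - Books", 26.02, 0.033),
    ("RedPajama - arXiv", 28.10, 0.019),
    ("RedPajama - StackExchange", 20.54, 0.014),
]

# Total training budget in billions of tokens (1T, per the README text).
TOTAL_TOKENS_B = 1000.0


def effective_tokens_b(proportion, total_b=TOTAL_TOKENS_B):
    """Assumed relation: effective tokens = proportion * total budget."""
    return proportion * total_b


def epochs(source_tokens_b, proportion, total_b=TOTAL_TOKENS_B):
    """Assumed relation: epochs = effective tokens / tokens in source."""
    return effective_tokens_b(proportion, total_b) / source_tokens_b


def sample_source(rng):
    """Pick one data source with probability equal to its proportion."""
    names = [name for name, _, _ in DATA_MIX]
    weights = [proportion for _, _, proportion in DATA_MIX]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs pick the same sources
    for name, source_b, proportion in DATA_MIX:
        print(f"{name}: {epochs(source_b, proportion):.2f} epochs")
    print("example draw:", sample_source(rng))
```

Drawing the source with `random.choices` and explicit weights keeps the sketch in the standard library. A real data loader would more likely mix pre-tokenized streams, which this example does not attempt to model.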