Update README.md
README.md CHANGED

@@ -14,7 +14,7 @@ DCLM-1B is a 1.4 billion parameter language model trained on the DCLM-Baseline d
 ## Model Details
 
 | Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |
-
+|:------:|:-----------------:|:--------:|:-------------:|:-----------------:|:----------------:|
 | 1.4B | 2.6T | 24 | 2048 | 16 | 2048 |
 
 
@@ -121,7 +121,7 @@ We evaluate the models using llm-foundry.
 
 
 | Task | Core | Extended | MMLU 5-shot |
-
+|:---------:|:------:|:----------:|:-------------:|
 | DCLM-1B | 42.3 | 25.1 | 41.9 |
 | SmolLM | 36.3 | 21.2 | 30.0 |
 
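The change above adds the delimiter row (cells of the form `---`, `:---:`, etc.) that GitHub-flavored markdown requires between a pipe table's header and its data rows; without it, the header and data lines are rendered as plain text. As a minimal sketch (the `has_separator` helper is hypothetical, not part of this repo), a quick check for that row could look like:

```python
import re

def has_separator(table_lines):
    """Return True if the second line of a pipe table is a valid
    delimiter row (cells like ---, :---, ---:, or :---:)."""
    if len(table_lines) < 2:
        return False
    cells = [c.strip() for c in table_lines[1].strip().strip('|').split('|')]
    return all(re.fullmatch(r':?-+:?', c) for c in cells)

# Before the commit: header followed directly by a data row,
# so markdown renderers do not recognize a table.
before = [
    "| Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |",
    "| 1.4B | 2.6T | 24 | 2048 | 16 | 2048 |",
]

# After the commit: the added delimiter row makes it a valid GFM table.
after = [
    "| Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |",
    "|:------:|:-----------------:|:--------:|:-------------:|:-----------------:|:----------------:|",
    "| 1.4B | 2.6T | 24 | 2048 | 16 | 2048 |",
]

print(has_separator(before))  # False
print(has_separator(after))   # True
```

The `:---:` form used in the commit also center-aligns each column when rendered.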