Signed-off-by: peter szemraj <peterszemraj@gmail.com>
README.md CHANGED
@@ -28,7 +28,7 @@ In our country, we say _"To let 100M parameters model generate python script and
 
 ## Base Model Information
 
-The base model, smol_llama-101M-GQA,
+The base model, smol_llama-101M-GQA, has been pre-trained on a relatively small number of high-quality tokens (fewer than ~20B). It achieves impressive performance despite its compact size of 101M parameters. Training data for this base model included:
 
 - [JeanKaddour/minipile](https://huggingface.co/datasets/JeanKaddour/minipile)
 - [pszemraj/simple_wikipedia_LM](https://huggingface.co/datasets/pszemraj/simple_wikipedia_LM)
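Since the base model and both pretraining datasets are hosted on the Hugging Face Hub, they can be pulled directly with the `transformers` and `datasets` libraries. Below is a minimal sketch of doing so; note that the `BEE-spoke-data/smol_llama-101M-GQA` hub id is an assumption (the diff names only `smol_llama-101M-GQA`), and the `text` column name is assumed from minipile's Pile-style schema.

```python
# Minimal sketch (not part of the commit above): load the base model and
# sample one of its cited pretraining datasets.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed hub id; the diff only names the model "smol_llama-101M-GQA".
model_id = "BEE-spoke-data/smol_llama-101M-GQA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Stream the minipile train split rather than downloading the full dataset.
minipile = load_dataset("JeanKaddour/minipile", split="train", streaming=True)
sample = next(iter(minipile))
print(sample["text"][:200])  # "text" column assumed from the dataset's schema
```

Streaming keeps the example lightweight: it fetches records lazily instead of materializing the whole corpus on disk, which is usually what you want when just inspecting a pretraining dataset.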