vilm's Collections
Smol Pretraining
Updated Feb 9
Curated & High quality Synthetic Textbook Datasets for Pretraining
vilm/code-textbooks • Viewer • Updated Jan 20 • 207k • 35 • 2
vilm/MathPile-arXiv • Viewer • Updated Jan 22 • 340k • 33 • 2
vilm/MathPile-StackExchange • Viewer • Updated Jan 22 • 264k • 36 • 1
vilm/MathPile-ProofWiki • Viewer • Updated Jan 22 • 23.6k • 38
vilm/MathPile-Textbooks • Viewer • Updated Jan 22 • 784 • 33
vilm/MathPile-Wikipedia • Viewer • Updated Jan 22 • 20.9k • 33 • 1
vilm/RedPajama-v2-small • Viewer • Updated Jan 20 • 500k • 47 • 1
vilm/RedPajama-v2-xsmall • Viewer • Updated Jan 20 • 250k • 42 • 1
vilm/the-stack-smol-xl-cleaned • Viewer • Updated Jan 20 • 205k • 39 • 1
vilm/refinedweb-1m-medium • Viewer • Updated Jan 20 • 1M • 127 • 2