kirch's Collections
Scotch & SOTA 🥃 Pt. 4: Pre-Training Datasets 📜
Updated Sep 25, 2023
We gotta start somewhere. These JSONLs aren't gonna train themselves.
allenai/dolma • Updated Apr 17 • 1.48k • 856
allenai/peS2o • Updated Oct 13 • 3.22k • 160
tiiuae/falcon-refinedweb • Updated Jun 20, 2023 • 968M • 39.5k • 823
CarperAI/pilev2-dev • Updated Mar 13, 2023 • 36 • 24
AlgorithmicResearchGroup/arxiv_cplusplus_research_code • Updated Sep 4 • 1.63M • 690 • 5
bigcode/the-stack • Updated Apr 13, 2023 • 546M • 5.71k • 748
bigcode/starcoderdata • Updated May 16, 2023 • 207M • 3.74k • 404
cerebras/SlimPajama-627B • Updated Jul 7, 2023 • 43.6k • 436
euirim/goodwiki • Updated Sep 11, 2023 • 44.8k • 103 • 52
nampdn-ai/tiny-textbooks • Updated Jul 3 • 420k • 138 • 148
nampdn-ai/tiny-codes • Updated Sep 30, 2023 • 1.63M • 323 • 233
roneneldan/TinyStories • Updated Aug 12 • 2.14M • 18k • 580
nampdn-ai/tiny-bridgedict • Updated Aug 4, 2023 • 17.6k • 35 • 17
nampdn-ai/tiny-webtext • Updated Aug 27, 2023 • 2.32M • 77 • 33