Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models • Mar 20 • 30
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale • Paper • 2406.17557 • Published 1 day ago • 40
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations • Paper • 2405.18392 • Published 29 days ago • 12
Leaderboards and benchmarks ✨ • Collection • Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 64 items • Updated 15 days ago • 69
ZeroGPU Spaces • Collection • ZeroGPU Spaces made by the community • 17 items • Updated 20 days ago • 203