view article Article LeRobot goes to driving school: World’s largest open-source self-driving dataset 1 day ago • 23
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated 19 days ago • 49
view article Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 22 days ago • 93
view article Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub 28 days ago • 49
view article Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 64
view article Article Explore, Curate and Vector Search Any Hugging Face Dataset with Nomic Atlas By MaxNomic and 4 others • Jan 23 • 30
high-quality Chinese training datasets Collection a suite of high-quality Chinese datasets, used for pretraining, fine-tuning or preference alignment. And the models trained on these datasets. • 13 items • Updated about 22 hours ago • 11
view article Article Finding Moroccan Arabic (Darija) in Fineweb 2 By omarkamali and 3 others • Dec 8, 2024 • 22