Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published 6 days ago • 35
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 32 items • Updated 5 days ago • 42
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published 4 days ago • 46
🏆 Open LLM Leaderboard 2 Space • Track, rank and evaluate open LLMs and chatbots • 11.9k
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published 15 days ago • 30
Vilawqa evaluation datasets Collection Datasets for evaluating the vilaw model • 5 items • Updated 23 days ago
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22 • 88
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning Paper • 2410.06456 • Published Oct 9 • 35
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment Paper • 2410.08193 • Published Oct 10 • 3