Hymba: A Hybrid-head Architecture for Small Language Models • Paper • 2411.13676 • Published 5 days ago • 33
LLM Pruning and Distillation in Practice: The Minitron Approach • Paper • 2408.11796 • Published Aug 21 • 55