FAST: Efficient Action Tokenization for Vision-Language-Action Models Paper • 2501.09747 • Published 28 days ago • 23
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training Paper • 2501.06842 • Published Jan 12 • 15
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 49
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published Dec 30, 2024 • 35
iFormer: Integrating ConvNet and Transformer for Mobile Application Paper • 2501.15369 • Published 19 days ago • 12
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models Paper • 2501.12370 • Published 23 days ago • 10
Return of the Encoder: Maximizing Parameter Efficiency for SLMs Paper • 2501.16273 • Published 17 days ago • 5