FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published 4 days ago • 12
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 3 days ago • 86
Whisper-GPT: A Hybrid Representation Audio Large Language Model Paper • 2412.11449 • Published 5 days ago • 4
TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning Paper • 2412.10447 • Published 10 days ago • 5
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6 • 47
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs Paper • 2412.08347 • Published 10 days ago • 4
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Paper • 2412.09626 • Published 9 days ago • 19
Gradio WebRTC Cookbook ⚡️ Collection • Collection of real-time voice and video demos built with gradio-webrtc custom component • 8 items • Updated 11 days ago • 8
Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement Paper • 2412.04003 • Published 16 days ago • 9
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance Paper • 2412.02687 • Published 18 days ago • 109
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published 15 days ago • 112
Article Power steering: Squeeze massive power from small LLMs By ucheog • 12 days ago • 4