Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training Paper • 2502.06589 • Published 7 days ago • 16
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning Paper • 2502.06060 • Published 8 days ago • 29
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 7 days ago • 125
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference Paper • 2502.04416 • Published 11 days ago • 10
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance Paper • 2502.04350 • Published 13 days ago • 10
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control 14 days ago • 99
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 5 days ago • 90
Article Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e Oct 3, 2023 • 8
Article Honesty, Open Source, and the Future of AI in Art: An Open Question By Duskfallcrew • 20 days ago • 4
Article Is Attention Interpretable in Transformer-Based Large Language Models? Let’s Unpack the Hype By royswastik • 20 days ago • 4
Article FuseO1-Preview: System-II Reasoning Fusion of LLMs By Wanfq and 4 others • 28 days ago • 14
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 26 days ago • 319
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 22 days ago • 99
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 691
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published 25 days ago • 36