VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published 2 days ago • 32
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published 3 days ago • 104
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 9 days ago • 100 • 6
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 15 days ago • 79 • 3
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 21 days ago • 67 • 4
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 23 days ago • 272 • 6
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints Paper • 2501.03841 • Published 30 days ago • 53 • 3
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published 27 days ago • 19
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 28 days ago • 87 • 5