Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 122
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published Feb 4 • 61
Unifying Specialized Visual Encoders for Video Language Models Paper • 2501.01426 • Published Jan 2 • 21
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 88
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild Paper • 2305.11147 • Published May 18, 2023 • 3