HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published 11 days ago • 85
MotiF: Making Text Count in Image Animation with Motion Focal Loss Paper • 2412.16153 • Published 16 days ago • 6
In Case You Missed It: ARC 'Challenge' Is Not That Challenging Paper • 2412.17758 • Published 13 days ago • 16
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation Paper • 2412.18597 • Published 12 days ago • 19
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published 13 days ago • 37
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding Paper • 2412.18450 • Published 12 days ago • 32
Sequence Matters: Harnessing Video Models in 3D Super-Resolution Paper • 2412.11525 • Published 20 days ago • 10
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities Paper • 2412.14123 • Published 18 days ago • 11
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs Paper • 2412.11258 • Published 21 days ago • 13
Causal Diffusion Transformers for Generative Modeling Paper • 2412.12095 • Published 20 days ago • 23