Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 13 days ago • 16
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models Paper • 2411.07126 • Published Nov 11, 2024 • 28
OpenCoder Collection OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23, 2024 • 78
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance Paper • 2411.02327 • Published Nov 4, 2024 • 11
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14, 2024 • 54
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation Paper • 2409.18964 • Published Sep 27, 2024 • 25
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 135
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 90
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Paper • 2408.15998 • Published Aug 28, 2024 • 84