R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning Paper • 2503.05379 • Published 6 days ago • 22
SkyReels-V1 Collection SkyReels V1 open models collections • 2 items • Updated 24 days ago • 18
Enhance-A-Video: Better Generated Video for Free Paper • 2502.07508 • Published 30 days ago • 21
Magic 1-For-1: Generating One Minute Video Clips within One Minute Paper • 2502.07701 • Published 29 days ago • 34
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published Feb 4 • 61
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published Jan 21 • 54
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 275
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 94
Dolphins: Multimodal Language Model for Driving Paper • 2312.00438 • Published Dec 1, 2023 • 15
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 73
Make Pixels Dance: High-Dynamic Video Generation Paper • 2311.10982 • Published Nov 18, 2023 • 69
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models Paper • 2311.10093 • Published Nov 16, 2023 • 58
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities Paper • 2311.05698 • Published Nov 9, 2023 • 14
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models Paper • 2311.05997 • Published Nov 10, 2023 • 37
OtterHD: A High-Resolution Multi-modality Model Paper • 2311.04219 • Published Nov 7, 2023 • 33
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model Paper • 2309.16058 • Published Sep 27, 2023 • 55