InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published Jul 3 • 93
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration Paper • 2406.01014 • Published Jun 3 • 31
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models Paper • 2309.15103 • Published Sep 26, 2023 • 42