Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published 9 days ago • 63
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation Paper • 2409.12576 • Published 8 days ago • 14
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation Paper • 2408.14819 • Published Aug 27 • 19
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7 • 54