Simulating Classroom Education with LLM-Empowered Agents Paper • 2406.19226 • Published 4 days ago • 27
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Paper • 2406.19389 • Published 4 days ago • 48
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data Paper • 2406.18790 • Published 5 days ago • 30
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Paper • 2406.18629 • Published 5 days ago • 34
Make It Count: Text-to-Image Generation with an Accurate Number of Objects Paper • 2406.10210 • Published 17 days ago • 74