Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published about 13 hours ago • 11 • 1
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents Paper • 2411.06559 • Published 12 days ago • 10 • 2
AnimateAnything: Consistent and Controllable Animation for Video Generation Paper • 2411.10836 • Published 6 days ago • 18 • 2
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts Paper • 2411.10669 • Published 6 days ago • 9 • 2
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers Paper • 2411.10510 • Published 7 days ago • 8 • 2
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on Paper • 2411.10499 • Published 7 days ago • 9 • 2
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement Paper • 2411.06558 • Published 12 days ago • 29 • 6
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use Paper • 2411.10323 • Published 7 days ago • 26 • 2
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published 7 days ago • 89 • 6