OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints Paper • 2501.03841 • Published 6 days ago • 36
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 3 days ago • 30
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published 6 days ago • 21
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published 6 days ago • 45