LLaVA-o1: Let Vision Language Models Reason Step-by-Step • Paper • 2411.10440 • Published 7 days ago • 91
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese • Paper • 2408.12480 • Published Aug 22 • 17
InternVL 2.0 • Collection • Expanding Performance Boundaries of Open-Source MLLM • 17 items • Updated about 15 hours ago • 79