GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published 5 days ago • 7
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published 4 days ago • 13
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published 5 days ago • 47
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published 11 days ago • 57
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models Paper • 2411.14257 • Published 5 days ago • 8
Stable Flow: Vital Layers for Training-Free Image Editing Paper • 2411.14430 • Published 5 days ago • 11
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Paper • 2411.14432 • Published 5 days ago • 18
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published 5 days ago • 36
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published 6 days ago • 13
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization Paper • 2411.11909 • Published 9 days ago • 20
Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering Paper • 2411.09213 • Published 12 days ago • 6
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published 10 days ago • 39
MARS: Unleashing the Power of Variance Reduction for Training Large Models Paper • 2411.10438 • Published 11 days ago • 11
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published 11 days ago • 99
Direct Preference Optimization Using Sparse Feature-Level Constraints Paper • 2411.07618 • Published 14 days ago • 15