LLM - a xxyyy123 Collection

LLM

Updated Sep 25

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Paper • 2407.21770 • Published Jul 31 • 22

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 39

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Paper • 2407.08583 • Published Jul 11 • 10

Vision language models are blind

Paper • 2407.06581 • Published Jul 9 • 82

ColPali: Efficient Document Retrieval with Vision Language Models

Paper • 2407.01449 • Published Jun 27 • 42

Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report

Paper • 2406.11403 • Published Jun 17 • 4

AIDC-AI/Ovis1.6-Gemma2-9B

Image-Text-to-Text • Updated 24 days ago • 5.87k • 259