vlm data - a poonyZ Collection

vlm data

updated Jan 7

MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation

Paper • 2412.07147 • Published Dec 10, 2024 • 5

Note: Worth noting

Grounding Descriptions in Images informs Zero-Shot Visual Recognition

Paper • 2412.04429 • Published Dec 5, 2024

Note: Average

Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models

Paper • 2412.05939 • Published Dec 8, 2024 • 16

Note: Worth noting

Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

Paper • 2412.08737 • Published Dec 11, 2024 • 53

Note: Worth noting

VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Paper • 2412.08687 • Published Dec 11, 2024 • 13

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities

Paper • 2412.07769 • Published Dec 10, 2024 • 26

Note: Average

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 51

Note: Worth watching

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published Dec 19, 2024 • 53

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage

Paper • 2412.15484 • Published Dec 20, 2024 • 15

Note: Worth watching