Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 21 days ago • 52
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 21 days ago • 52
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Paper • 2412.07720 • Published 23 days ago • 30
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems Paper • 2402.14008 • Published Feb 21, 2024
GUICourse: From General Vision Language Models to Versatile GUI Agents Paper • 2406.11317 • Published Jun 17, 2024 • 1
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback Paper • 2309.10691 • Published Sep 19, 2023 • 4
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets Paper • 2309.17428 • Published Sep 29, 2023 • 1
Advancing LLM Reasoning Generalists with Preference Trees Paper • 2404.02078 • Published Apr 2, 2024 • 44
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness Paper • 2405.17220 • Published May 27, 2024 • 1
UltraMedical: Building Specialized Generalists in Biomedicine Paper • 2406.03949 • Published Jun 6, 2024