Solving math word problems with process- and outcome-based feedback Paper • 2211.14275 • Published Nov 25, 2022 • 8
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 9 days ago • 91
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 50
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12, 2024 • 64
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only Paper • 2306.01116 • Published Jun 1, 2023 • 32
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published Nov 22, 2024 • 16
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution Paper • 2409.12961 • Published Sep 19, 2024 • 25
Oryx-1.5 Collection Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution • 2 items • Updated Oct 23, 2024 • 3
Oryx Collection Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding • 6 items • Updated about 1 month ago • 15
LongVA Collection Long Context Transfer From Text To Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024 • 13
Quantifying the Carbon Emissions of Machine Learning Paper • 1910.09700 • Published Oct 21, 2019 • 13
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 75
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024 • 35
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 554
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Paper • 2311.16502 • Published Nov 27, 2023 • 35