InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning Paper • 2409.12568 • Published 10 days ago • 44
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published 9 days ago • 119
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20 • 45
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published Apr 25 • 53
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 10 days ago • 467
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published Apr 19 • 29
BRAVE: Broadening the visual encoding of vision-language models Paper • 2404.07204 • Published Apr 10 • 15
MoAI: Mixture of All Intelligence for Large Language and Vision Models Paper • 2403.07508 • Published Mar 12 • 75