LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model • Paper • 2404.01331 • Published Mar 29, 2024
BRAVE: Broadening the visual encoding of vision-language models • Paper • 2404.07204 • Published Apr 10, 2024