Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published 14 days ago • 36
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published 28 days ago • 123
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published 29 days ago • 58
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension Paper • 2412.03704 • Published about 1 month ago • 6
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance Paper • 2412.02687 • Published Dec 3, 2024 • 108
VoCo-LLaMA: Towards Vision Compression with Large Language Models Paper • 2406.12275 • Published Jun 18, 2024 • 29