Collections
Discover the best community collections!
Collections including paper arxiv:2311.04901
-
Random Field Augmentations for Self-Supervised Representation Learning
Paper • 2311.03629 • Published • 6 -
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models
Paper • 2311.04589 • Published • 18 -
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
Paper • 2311.04901 • Published • 7 -
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
Paper • 2311.06783 • Published • 26
-
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Paper • 2310.16045 • Published • 14 -
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Paper • 2310.14566 • Published • 25 -
SILC: Improving Vision Language Pretraining with Self-Distillation
Paper • 2310.13355 • Published • 6 -
Conditional Diffusion Distillation
Paper • 2310.01407 • Published • 20
-
Eureka: Human-Level Reward Design via Coding Large Language Models
Paper • 2310.12931 • Published • 26 -
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs
Paper • 2311.04901 • Published • 7 -
Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems
Paper • 2311.05884 • Published • 5 -
PolyMaX: General Dense Prediction with Mask Transformer
Paper • 2311.05770 • Published • 6