Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Paper • 2404.19752 • Published Apr 30 • 20
Customizing Text-to-Image Models with a Single Image Pair Paper • 2405.01536 • Published May 2 • 17
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 106
Taming Latent Diffusion Model for Neural Radiance Field Inpainting Paper • 2404.09995 • Published Apr 15 • 6
Scaling Instructable Agents Across Many Simulated Worlds Paper • 2404.10179 • Published Mar 13 • 23
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation Paper • 2403.16990 • Published Mar 25 • 24
FlashFace: Human Image Personalization with High-fidelity Identity Preservation Paper • 2403.17008 • Published Mar 25 • 18
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models Paper • 2403.13535 • Published Mar 20 • 20
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20 • 58
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Paper • 2403.12015 • Published Mar 18 • 60
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15 • 56
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization Paper • 2403.00483 • Published Mar 1 • 9
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks Paper • 2403.00522 • Published Mar 1 • 40
Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All Paper • 2401.13795 • Published Jan 24 • 64
Deconstructing Denoising Diffusion Models for Self-Supervised Learning Paper • 2401.14404 • Published Jan 25 • 16
Generative Multimodal Models are In-Context Learners Paper • 2312.13286 • Published Dec 20, 2023 • 32