Collections
Collections including paper arxiv:2502.01720
-
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
Paper • 2412.09622 • Published • 8
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Paper • 2412.04146 • Published • 22
Learning Flow Fields in Attention for Controllable Person Image Generation
Paper • 2412.08486 • Published • 33
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
Paper • 2412.05148 • Published • 11
-
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Paper • 2410.10306 • Published • 54
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Paper • 2411.04709 • Published • 25
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Paper • 2410.07171 • Published • 42
-
pOps: Photo-Inspired Diffusion Operators
Paper • 2406.01300 • Published • 17
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Paper • 2406.06911 • Published • 11
Interpreting the Weight Space of Customized Diffusion Models
Paper • 2406.09413 • Published • 19
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
Paper • 2406.09162 • Published • 13
-
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels
Paper • 2405.07526 • Published • 19
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Paper • 2405.15613 • Published • 15
A Touch, Vision, and Language Dataset for Multimodal Alignment
Paper • 2402.13232 • Published • 15
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper • 2406.11813 • Published • 31