Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning Paper • 2311.10709 • Published Nov 17, 2023 • 24
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published May 21 • 22
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published May 19 • 53
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis Paper • 2405.14224 • Published May 23 • 12
Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching Paper • 2405.11252 • Published May 18 • 12
OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person Paper • 2407.16224 • Published Jul 23 • 24
DDK: Distilling Domain Knowledge for Efficient Large Language Models Paper • 2407.16154 • Published Jul 23 • 20
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency Paper • 2407.17470 • Published Jul 24 • 14
DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection Paper • 2406.00856 • Published Jun 2 • 9
Tora: Trajectory-oriented Diffusion Transformer for Video Generation Paper • 2407.21705 • Published Jul 31 • 25
VidGen-1M: A Large-Scale Dataset for Text-to-video Generation Paper • 2408.02629 • Published Aug 5 • 13
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 97
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations Paper • 2408.08459 • Published Aug 15 • 44
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning Paper • 2408.11001 • Published Aug 20 • 11
TraDiffusion: Trajectory-Based Training-Free Image Generation Paper • 2408.09739 • Published Aug 19 • 7