ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 3 days ago • 74
Wan2.1 14B 480p I2V LoRAs Collection A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 24 items • Updated 3 days ago • 55
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published 6 days ago • 56
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models Paper • 2503.05638 • Published 10 days ago • 17
VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control Paper • 2503.05639 • Published 10 days ago • 21
MeshPad: Interactive Sketch Conditioned Artistic-designed Mesh Generation and Editing Paper • 2503.01425 • Published 14 days ago • 14
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion Paper • 2503.01183 • Published 14 days ago • 26
Tell me why: Visual foundation models as self-explainable classifiers Paper • 2502.19577 • Published 19 days ago • 10
Mobius: Text to Seamless Looping Video Generation via Latent Shift Paper • 2502.20307 • Published 18 days ago • 17
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute Paper • 2502.20126 • Published 18 days ago • 20
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published 20 days ago • 53
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published 20 days ago • 34
KV-Edit: Training-Free Image Editing for Precise Background Preservation Paper • 2502.17363 • Published 21 days ago • 33
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published 25 days ago • 38
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 25 days ago • 130