- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 15
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 8
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 10
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 13
Collections including paper arxiv:2411.18671
- Differential Transformer
  Paper • 2410.05258 • Published • 166
- PaliGemma 2: A Family of Versatile VLMs for Transfer
  Paper • 2412.03555 • Published • 112
- VisionZip: Longer is Better but Not Necessary in Vision Language Models
  Paper • 2412.04467 • Published • 99
- o1-Coder: an o1 Replication for Coding
  Paper • 2412.00154 • Published • 36

- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
  Paper • 2410.03290 • Published • 6
- TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
  Paper • 2411.18671 • Published • 19
- VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
  Paper • 2412.00927 • Published • 25

- Training-free Long Video Generation with Chain of Diffusion Model Experts
  Paper • 2408.13423 • Published • 22
- Loong: Generating Minute-level Long Videos with Autoregressive Language Models
  Paper • 2410.02757 • Published • 36
- MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
  Paper • 2411.13807 • Published • 11
- TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
  Paper • 2411.18671 • Published • 19

- DreamGaussian4D: Generative 4D Gaussian Splatting
  Paper • 2312.17142 • Published • 18
- Presto! Distilling Steps and Layers for Accelerating Music Generation
  Paper • 2410.05167 • Published • 15
- OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction
  Paper • 2410.04932 • Published • 9
- Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices
  Paper • 2410.11795 • Published • 16