Collections

Collections including paper arxiv:2411.18671

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18 • 15
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18 • 8
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19 • 13

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 166
PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published 12 days ago • 112
VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published 11 days ago • 99
o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published 17 days ago • 36

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Paper • 2410.03290 • Published Oct 4 • 6
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video

Paper • 2411.18671 • Published 19 days ago • 19
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Paper • 2412.00927 • Published 15 days ago • 25

Training-free Long Video Generation with Chain of Diffusion Model Experts

Paper • 2408.13423 • Published Aug 24 • 22
Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3 • 36
MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

Paper • 2411.13807 • Published 25 days ago • 11
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video

Paper • 2411.18671 • Published 19 days ago • 19

DreamGaussian4D: Generative 4D Gaussian Splatting

Paper • 2312.17142 • Published Dec 28, 2023 • 18
Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published Oct 7 • 15
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction

Paper • 2410.04932 • Published Oct 7 • 9
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Paper • 2410.11795 • Published Oct 15 • 16
