An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels Paper • 2406.09415 • Published Jun 13, 2024 • 50
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling Paper • 2405.21048 • Published May 31, 2024 • 14
SpatialTracker: Tracking Any 2D Pixels in 3D Space Paper • 2404.04319 • Published Apr 5, 2024 • 24
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Paper • 2403.13248 • Published Mar 20, 2024 • 78
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models Paper • 2403.05438 • Published Mar 8, 2024 • 18
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on Paper • 2403.01779 • Published Mar 4, 2024 • 28
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27, 2024 • 88
DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model Paper • 2402.17412 • Published Feb 27, 2024 • 21
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting Paper • 2402.06149 • Published Feb 9, 2024 • 17
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild Paper • 2401.13627 • Published Jan 24, 2024 • 73
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation Paper • 2309.16653 • Published Sep 28, 2023 • 46