How Far Are We from Intelligent Visual Deductive Reasoning? Paper • 2403.04732 • Published Mar 7, 2024
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16, 2024
Learning Controllable 3D Diffusion Models from Single-view Images Paper • 2304.06700 • Published Apr 13, 2023
Stabilizing Transformer Training by Preventing Attention Entropy Collapse Paper • 2303.06296 • Published Mar 11, 2023
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping Paper • 2306.05544 • Published Jun 8, 2023
PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model Paper • 2306.02531 • Published Jun 5, 2023
Position Prediction as an Effective Pretraining Strategy Paper • 2207.07611 • Published Jul 15, 2022
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling Paper • 2405.21048 • Published May 31, 2024
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation Paper • 2410.08159 • Published Oct 10, 2024
World-consistent Video Diffusion with Explicit 3D Modeling Paper • 2412.01821 • Published Dec 2, 2024