Colorful Diffuse Intrinsic Image Decomposition in the Wild Paper • 2409.13690 • Published 10 days ago • 12
Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos Paper • 2312.10300 • Published Dec 16, 2023 • 1
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Paper • 2403.03206 • Published Mar 5 • 56
Optimum-NVIDIA - Unlock blazingly fast LLM inference in just 1 line of code Article • Dec 5, 2023 • 4
IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts Paper • 2408.03209 • Published Aug 6 • 21
ViPer: Visual Personalization of Generative Models via Individual Preference Learning Paper • 2407.17365 • Published Jul 24 • 11
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency Paper • 2407.17470 • Published Jul 24 • 14
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition Paper • 2402.15504 • Published Feb 23 • 21
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors Paper • 2310.12190 • Published Oct 18, 2023 • 10
InstructVideo: Instructing Video Diffusion Models with Human Feedback Paper • 2312.12490 • Published Dec 19, 2023 • 17
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 80
GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment Paper • 2310.11513 • Published Oct 17, 2023 • 1
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Paper • 2311.05556 • Published Nov 9, 2023 • 79
SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion Paper • 2403.12008 • Published Mar 18 • 19
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty Paper • 2407.06071 • Published Jul 8 • 7
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild Paper • 2211.14758 • Published Nov 27, 2022 • 1
Guiding a Diffusion Model with a Bad Version of Itself Paper • 2406.02507 • Published Jun 4 • 15
SDXL-Lightning: Progressive Adversarial Diffusion Distillation Paper • 2402.13929 • Published Feb 21 • 27
Guiding Instruction-based Image Editing via Multimodal Large Language Models Paper • 2309.17102 • Published Sep 29, 2023 • 3
Revisiting Feature Prediction for Learning Visual Representations from Video Paper • 2404.08471 • Published Feb 15 • 1
Jina CLIP: Your CLIP Model Is Also Your Text Retriever Paper • 2405.20204 • Published May 30 • 29
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture Paper • 2301.08243 • Published Jan 19, 2023 • 6
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control Paper • 2405.17414 • Published May 27 • 10
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing Paper • 2305.14720 • Published May 24, 2023 • 2
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published May 27 • 51
Learning Transferable Visual Models From Natural Language Supervision Paper • 2103.00020 • Published Feb 26, 2021 • 11
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers Paper • 2106.10270 • Published Jun 18, 2021 • 2
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Paper • 2405.01434 • Published May 2 • 51
Training Stable Diffusion with Dreambooth using 🧨 Diffusers Article • Nov 7, 2022 • 14
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models Paper • 2112.10741 • Published Dec 20, 2021 • 3
V3D: Video Diffusion Models are Effective 3D Generators Paper • 2403.06738 • Published Mar 11 • 28
Speculative Streaming: Fast LLM Inference without Auxiliary Models Paper • 2402.11131 • Published Feb 16 • 41