Collections
Collections including paper arxiv:2404.19759

- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
  Paper • 2405.01535 • Published • 106
- StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
  Paper • 2405.01434 • Published • 49
- WildChat: 1M ChatGPT Interaction Logs in the Wild
  Paper • 2405.01470 • Published • 57
- A Careful Examination of Large Language Model Performance on Grade School Arithmetic
  Paper • 2405.00332 • Published • 30

- InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation
  Paper • 2404.19427 • Published • 69
- MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
  Paper • 2404.19759 • Published • 24
- Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
  Paper • 2404.19752 • Published • 20
- Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting
  Paper • 2404.19758 • Published • 10

- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
  Paper • 2403.16990 • Published • 24
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 48
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 29
- Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
  Paper • 2404.01367 • Published • 19

- SDXL-Lightning: Progressive Adversarial Diffusion Distillation
  Paper • 2402.13929 • Published • 26
- Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
  Paper • 2403.12015 • Published • 60
- MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
  Paper • 2404.19759 • Published • 24

- Explorative Inbetweening of Time and Space
  Paper • 2403.14611 • Published • 10
- MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
  Paper • 2403.01422 • Published • 24
- DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
  Paper • 2402.11929 • Published • 9
- StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
  Paper • 2403.14773 • Published • 8

- Video as the New Language for Real-World Decision Making
  Paper • 2402.17139 • Published • 18
- Learning and Leveraging World Models in Visual Representation Learning
  Paper • 2403.00504 • Published • 26
- MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
  Paper • 2403.01422 • Published • 24
- VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
  Paper • 2403.05438 • Published • 15

- Training-Free Consistent Text-to-Image Generation
  Paper • 2402.03286 • Published • 62
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
  Paper • 2402.04324 • Published • 22
- λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
  Paper • 2402.05195 • Published • 16
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 23
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 111
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 71
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 31