Negative Token Merging: Image-based Adversarial Feature Guidance Paper • 2412.01339 • Published Dec 2, 2024 • 22
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 119
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters Paper • 2412.00174 • Published Nov 29, 2024 • 22
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 32
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published Nov 28, 2024 • 13
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model Paper • 2411.19108 • Published Nov 28, 2024 • 17
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models Paper • 2411.18613 • Published Nov 27, 2024 • 50
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published Nov 26, 2024 • 76
MyTimeMachine: Personalized Facial Age Transformation Paper • 2411.14521 • Published Nov 21, 2024 • 20
AnimateAnything: Consistent and Controllable Animation for Video Generation Paper • 2411.10836 • Published Nov 16, 2024 • 23
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models Paper • 2411.07126 • Published Nov 11, 2024 • 28
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality Paper • 2410.19355 • Published Oct 25, 2024 • 23
VidPanos: Generative Panoramic Videos from Casual Panning Videos Paper • 2410.13832 • Published Oct 17, 2024 • 12
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated Oct 15, 2024 • 149
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations Paper • 2410.10792 • Published Oct 14, 2024 • 29
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14, 2024 • 54
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation Paper • 2410.07171 • Published Oct 9, 2024 • 41
TextToon: Real-Time Text Toonify Head Avatar from Single Video Paper • 2410.07160 • Published Sep 23, 2024 • 8
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation Paper • 2410.05591 • Published Oct 8, 2024 • 13