Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient Paper • 2411.17787 • Published 2 days ago • 9
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published 3 days ago • 57
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator Paper • 2411.15466 • Published 6 days ago • 33
Style-Friendly SNR Sampler for Style-Driven Generation Paper • 2411.14793 • Published 7 days ago • 35
OminiControl: Minimal and Universal Control for Diffusion Transformer Paper • 2411.15098 • Published 6 days ago • 38
Attention Prompting on Image for Large Vision-Language Models Paper • 2409.17143 • Published Sep 25 • 7