Nested Attention: Semantic-aware Attention Values for Concept Personalization
Abstract
Personalizing text-to-image models to generate images of specific subjects across diverse scenes and styles is a rapidly advancing field. Current approaches often face challenges in maintaining a balance between identity preservation and alignment with the input text prompt. Some methods rely on a single textual token to represent a subject, which limits expressiveness, while others employ richer representations but disrupt the model's prior, diminishing prompt alignment. In this work, we introduce Nested Attention, a novel mechanism that injects a rich and expressive image representation into the model's existing cross-attention layers. Our key idea is to generate query-dependent subject values, derived from nested attention layers that learn to select relevant subject features for each region in the generated image. We integrate these nested layers into an encoder-based personalization method, and show that they enable high identity preservation while adhering to input text prompts. Our approach is general and can be trained on various domains. Additionally, its prior preservation allows us to combine multiple personalized subjects from different domains in a single image.
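The central mechanism described above, query-dependent subject values produced by a nested attention over encoded subject features, can be illustrated with a minimal sketch. This is not the authors' implementation: the tensor names, projection layers, and dimensions (`img_queries`, `subject_feats`, `to_k`, `to_v`) are assumptions made for illustration only, following the abstract's description.

```python
# Minimal sketch (not the paper's code) of query-dependent subject values.
# Assumed inputs: `img_queries` are the cross-attention queries of the generated
# image, `subject_feats` are encoder features of the personalized subject.

import torch
import torch.nn as nn
import torch.nn.functional as F


class NestedAttentionValue(nn.Module):
    """For each image query, attend over subject features to produce a
    query-dependent value that could replace the subject token's value."""

    def __init__(self, d_query: int, d_subj: int, d_value: int):
        super().__init__()
        self.to_k = nn.Linear(d_subj, d_query)  # keys from subject features
        self.to_v = nn.Linear(d_subj, d_value)  # values from subject features

    def forward(self, img_queries: torch.Tensor, subject_feats: torch.Tensor) -> torch.Tensor:
        # img_queries:   (B, N_img, d_query)  - queries of the generation's cross-attention
        # subject_feats: (B, N_subj, d_subj)  - encoded features of the subject image
        k = self.to_k(subject_feats)  # (B, N_subj, d_query)
        v = self.to_v(subject_feats)  # (B, N_subj, d_value)
        # Nested attention: each spatial query selects the subject features
        # most relevant to its region of the generated image.
        return F.scaled_dot_product_attention(img_queries, k, v)  # (B, N_img, d_value)


if __name__ == "__main__":
    layer = NestedAttentionValue(d_query=64, d_subj=128, d_value=64)
    q = torch.randn(1, 4096, 64)   # e.g. a 64x64 latent, flattened
    s = torch.randn(1, 256, 128)   # e.g. 16x16 patch features of the subject
    print(layer(q, s).shape)       # torch.Size([1, 4096, 64])
```

In the mechanism the abstract describes, such per-query values would stand in for the single fixed value of the subject's text token inside the model's existing cross-attention layers, while all other prompt tokens keep their ordinary keys and values, which is what allows identity injection without disrupting the model's prior.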
Community
Librarian Bot (automated): the following papers, similar to this one, were recommended by the Semantic Scholar API.
- DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models (2024)
- Foundation Cures Personalization: Recovering Facial Personalized Models' Prompt Consistency (2024)
- Personalized Large Vision-Language Models (2024)
- FashionComposer: Compositional Fashion Image Generation (2024)
- DECOR: Decomposition and Projection of Text Embeddings for Text-to-Image Customization (2024)
- Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis (2024)
- Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis (2024)