How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
Abstract
Custom diffusion models (CDMs) have attracted widespread attention due to their astonishing generative ability for personalized concepts. However, most existing CDMs unreasonably assume that personalized concepts are fixed and cannot change over time. Moreover, they heavily suffer from catastrophic forgetting and concept neglect on old personalized concepts when continually learning a series of new concepts. To address these challenges, we propose a novel Concept-Incremental text-to-image Diffusion Model (CIDM), which can resolve catastrophic forgetting and concept neglect to learn new customization tasks in a concept-incremental manner. Specifically, to surmount the catastrophic forgetting of old concepts, we develop a concept consolidation loss and an elastic weight aggregation module. They explore task-specific and task-shared knowledge during training, and aggregate all low-rank weights of old concepts based on their contributions during inference. Moreover, to address concept neglect, we devise a context-controllable synthesis strategy that leverages expressive region features and noise estimation to control the contexts of generated images according to user conditions. Experiments validate that our CIDM surpasses existing custom diffusion models. The source codes are available at https://github.com/JiahuaDong/CIFC.
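To make the weight-aggregation idea concrete, the sketch below is a minimal illustration (not the paper's implementation): it merges per-concept low-rank (LoRA-style) weight updates by a contribution-weighted sum. The function name, the softmax normalization, and the contribution scores are assumptions for illustration only.

```python
import torch

def aggregate_lowrank_deltas(lora_deltas, contributions):
    """Merge per-concept low-rank weight updates into a single delta.

    lora_deltas:   list of tensors, one (B @ A) low-rank update per learned concept
    contributions: 1-D tensor of per-concept relevance scores (hypothetical inputs)
    """
    weights = torch.softmax(contributions, dim=0)   # normalize scores to sum to 1
    merged = torch.zeros_like(lora_deltas[0])
    for delta, w in zip(lora_deltas, weights):
        merged = merged + w * delta                 # contribution-weighted accumulation
    return merged

# Usage sketch: three previously learned concepts, one base layer of shape (out, in)
deltas = [torch.randn(768, 768) for _ in range(3)]
scores = torch.tensor([0.2, 1.5, 0.8])
merged_delta = aggregate_lowrank_deltas(deltas, scores)   # added to the frozen base weight
```

How the contribution scores are actually computed in CIDM is described in the paper; the softmax weighting here is only a placeholder for that step.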