DiffBlender Model Card
This repo contains the models from our paper DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models.
Model details
Model type: DiffBlender is a multimodal text-to-image diffusion model that synthesizes images from complex combinations of input conditions. It supports flexible manipulation of these conditions, enabling customized generation aligned with user preferences. Its architecture is designed to extend intuitively to additional modalities while keeping training cost low through a partial update of hypernetworks.
We provide a model checkpoint trained with six modalities: sketch, depth map, grounding box, keypoints, color palette, and style embedding (see ./checkpoint_latest.pth).
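As a minimal sketch for inspecting the released file (this assumes standard PyTorch serialization; prefer the repository's own loading utilities for actual use):

```python
import torch

# Minimal sketch: load and inspect the released checkpoint. Assumes a
# standard PyTorch serialization format.
state = torch.load("./checkpoint_latest.pth", map_location="cpu")

# Checkpoints are often either a raw state_dict or a dict wrapping one
# (e.g., under a "state_dict" or "model" key); print top-level keys to check.
if isinstance(state, dict):
    for key in list(state.keys())[:10]:
        print(key)
```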
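Purely for illustration, the six conditions might be bundled along the following lines; every key, shape, and format below is a hypothetical assumption, not the repository's actual input interface:

```python
import torch

# Hypothetical condition bundle for the six supported modalities. All keys
# and tensor shapes are assumptions for illustration only; consult the repo
# for the real input format.
conditions = {
    "sketch": torch.zeros(1, 1, 512, 512),    # binary edge/sketch map
    "depth": torch.zeros(1, 1, 512, 512),     # normalized depth map
    "box": [("cat", (0.1, 0.2, 0.6, 0.9))],   # grounding (label, bbox)
    "keypoints": torch.zeros(1, 17, 2),       # body keypoints as (x, y)
    "color_palette": torch.zeros(1, 8, 3),    # top-8 RGB palette colors
    "style": torch.zeros(1, 768),             # style embedding (e.g., CLIP)
}
```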
License: Apache 2.0
Where to send questions or comments about the model: https://github.com/sungnyun/diffblender/issues
Training dataset
More details are available on our project page: https://sungnyun.github.io/diffblender/.