Usage with diffusers
Hello, I am trying to get this model to run using the diffusers StableDiffusionPipeline/StableDiffusionImg2ImgPipeline. As best I can tell, there are two formats for models - CKPT and a "directory with model_index.json" - and diffusers only supports loading the latter via from_pretrained.
Is there anything I'm missing, or is there a straightforward script that could be added to the documentation showing how to use this model with the diffusers package?
EDIT: Thank you so much for this work!
I haven't tried it myself, but it looks like there is a PR for checkpoint conversion:
https://github.com/huggingface/diffusers/pull/154
That's great, is there any chance you could provide the "YAML config file corresponding to the original architecture"? It's a required argument for that script.
https://github.com/CompVis/stable-diffusion/blob/main/configs/stable-diffusion/v1-inference.yaml
v1-inference.yaml from the original SD repo should work.
Thanks! Unfortunately, it seems like that YAML is missing configs for {'safety_checker', 'text_encoder', 'vae', 'feature_extractor'}
- I get the following when trying to initialize a pipeline with the model that the script generates:
TypeError: __init__() missing 4 required positional arguments: 'vae', 'text_encoder', 'safety_checker', and 'feature_extractor'
Any thoughts? Thanks so much for your help!
Can you try downloading the original (diffusers) SD 1.4 repo and merging the missing/required files?
This repo is intended to be used with the latent-diffusion/stable-diffusion repos rather than diffusers, but if anyone is willing to upload a converted checkpoint, that would be welcome.
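Something along these lines might work as a stopgap (untested sketch; ./trinart_converted is just what I'm assuming the conversion script wrote to, and the SD 1.4 repo may require accepting the license / passing use_auth_token=True):

# Untested sketch: reuse the converted UNet and borrow the remaining
# components from the diffusers SD 1.4 repo. Paths and names are assumptions.
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained("./trinart_converted", subfolder="unet")
base = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", use_auth_token=True
)
pipe = StableDiffusionPipeline(
    vae=base.vae,
    text_encoder=base.text_encoder,
    tokenizer=base.tokenizer,
    unet=unet,
    scheduler=base.scheduler,
    safety_checker=base.safety_checker,
    feature_extractor=base.feature_extractor,
)
pipe.save_pretrained("./trinart_merged")

Note this borrows the VAE and text encoder from SD 1.4 wholesale; the rename-based approach described below keeps the converted checkpoint's own components instead.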
Hi @ayan4m1 @naclbit,
I converted the model a while ago, and if I remember correctly, I think the following is what I did after running the script mentioned above:
- Rename vqvae to vae and bert to text_encoder in the config file.
- Rename the directories vqvae to vae and bert to text_encoder.
- Copy the directories feature_extractor and safety_checker from the directory of diffusers' version of the Stable Diffusion model.
The converted model seems to be working fine, but I haven't checked whether the original and converted models produce the same results.
Hope this helps.
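For reference, those steps could be scripted roughly like this (a sketch only; ./trinart_converted and ./stable-diffusion-v1-4 are hypothetical local paths for the conversion output and a local copy of diffusers' SD model):

# Rough sketch of the manual steps above; paths are assumptions.
import json
import os
import shutil

converted = "./trinart_converted"          # output of the conversion script
sd_diffusers = "./stable-diffusion-v1-4"   # local copy of diffusers' SD model

# 1. Rename vqvae -> vae and bert -> text_encoder in the config file.
index_path = os.path.join(converted, "model_index.json")
with open(index_path) as f:
    index = json.load(f)
index["vae"] = index.pop("vqvae")
index["text_encoder"] = index.pop("bert")
# (Depending on the diffusers version, _class_name may also need to be set to
#  "StableDiffusionPipeline"; that's an assumption, not part of the steps above.)
with open(index_path, "w") as f:
    json.dump(index, f, indent=2)

# 2. Rename the directories to match.
os.rename(os.path.join(converted, "vqvae"), os.path.join(converted, "vae"))
os.rename(os.path.join(converted, "bert"), os.path.join(converted, "text_encoder"))

# 3. Copy feature_extractor and safety_checker from the diffusers SD model.
for name in ("feature_extractor", "safety_checker"):
    shutil.copytree(os.path.join(sd_diffusers, name), os.path.join(converted, name))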
@ayan4m1
FYI, I realized that you don't have to manually rename or copy files if you apply the following patch to the ldm-txt2im-conv-script branch of diffusers.
--- a/scripts/convert_ldm_txt2img_original_checkpoint_to_diffusers.py
+++ b/scripts/convert_ldm_txt2img_original_checkpoint_to_diffusers.py
@@ -22,9 +22,10 @@ try:
except ImportError:
raise ImportError("OmegaConf is required to convert the LDM checkpoints. Please install it with `pip install OmegaConf`.")
-from transformers import BertTokenizerFast, CLIPTokenizer, CLIPTextModel
-from diffusers import LDMTextToImagePipeline, AutoencoderKL, UNet2DConditionModel, DDIMScheduler
+from transformers import BertTokenizerFast, CLIPFeatureExtractor, CLIPTokenizer, CLIPTextModel
+from diffusers import StableDiffusionPipeline, AutoencoderKL, UNet2DConditionModel, DDIMScheduler
from diffusers.pipelines.latent_diffusion.pipeline_latent_diffusion import LDMBertModel, LDMBertConfig
+from diffusers.pipelines.stable_diffusion import StableDiffusionSafetyChecker
def shave_segments(path, n_shave_prefix_segments=1):
@@ -595,6 +596,8 @@ if __name__ == "__main__":
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
scheduler = create_diffusers_schedular(original_config)
- pipe = LDMTextToImagePipeline(vqvae=vae, bert=text_model, tokenizer=tokenizer, unet=unet, scheduler=scheduler)
+ safety_checker = StableDiffusionSafetyChecker.from_pretrained('CompVis/stable-diffusion-safety-checker')
+ feature_extractor = CLIPFeatureExtractor()
+ pipe = StableDiffusionPipeline(vae=vae, text_encoder=text_model, tokenizer=tokenizer, unet=unet, scheduler=scheduler, safety_checker=safety_checker, feature_extractor=feature_extractor)
pipe.save_pretrained(args.dump_path)
Related issue: https://github.com/huggingface/diffusers/issues/491
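With the patch applied, the directory the script writes (whatever you pass as dump_path) should load directly as a StableDiffusionPipeline, e.g. (output path hypothetical):

from diffusers import StableDiffusionPipeline

# "./trinart_converted" stands in for the dump_path you passed to the script.
pipe = StableDiffusionPipeline.from_pretrained("./trinart_converted")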
Hey,
Happy to help with the conversion to diffusers if you want!
@hysts - should we maybe do a clean stable diffusion script that makes conversion a bit easier? Happy to help with a PR that adds a clean script.
I think it would make sense to put everything in the same repo, or else we could also have two branches, "compvis" and "diffusers" - up to you!
@patrickvonplaten
Yes, I agree that two branches would be more convenient, given that the diffusers port is technically a conversion and some people may want to continue training from the checkpoints.
Cool - should we go for two branches then?
@naclbit
you have two options I guess :sweat:
1. The "main" branch stays this model and we add a "diffusers" branch (and potentially add a copy of "main" to a "compvis" branch)
2. The "main" branch becomes "diffusers" and we add a "compvis" branch (and potentially add a copy of "main" to a "diffusers" branch)
Maybe easiest to go for 1), which would essentially mean just adding a diffusers branch.
@patrickvonplaten
Option 1 sounds like a stress-free way to handle this.
Super - @ayan4m1 would you like to open a PR with the converted weights? Then I could somewhat manually move the weights into a branch (sadly we cannot do this in an easy way yet).
Pushed the diffusers model to https://huggingface.co/ayan4m1/trinart_diffusers_v2 - feel free to pull it into your diffusers branch; I couldn't figure out how to open a PR against any branch other than main.
Working on trimming it down to be a bit smaller, but at least it works for now.
Thanks a lot @ayan4m1 - I've used your checkpoints and added them, together with two newly converted checkpoints, to this repo as discussed. It should now be trivially easy to use them with diffusers:
from torch import autocast
from diffusers import StableDiffusionPipeline

model_id = "naclbit/trinart_stable_diffusion_v2"
device = "cuda"

pipe = StableDiffusionPipeline.from_pretrained(model_id)
pipe = pipe.to(device)

prompt = "A magical dragon flying in front of the Himalaya in manga style"
with autocast("cuda"):
    image = pipe(prompt).images[0]
By default, I've added the K-LMS sampler as proposed in the README.
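If you want to set it explicitly (or swap schedulers later), something like this should work; the beta values below are the usual Stable Diffusion settings and are my assumption here, and depending on which branch hosts the diffusers weights you may also need the revision argument mentioned below:

from diffusers import LMSDiscreteScheduler, StableDiffusionPipeline

# K-LMS scheduler with the usual SD beta schedule (assumed values).
scheduler = LMSDiscreteScheduler(
    beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear"
)
pipe = StableDiffusionPipeline.from_pretrained(
    "naclbit/trinart_stable_diffusion_v2", scheduler=scheduler
)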
Also opened a PR to add some examples to the README: https://huggingface.co/naclbit/trinart_stable_diffusion_v2/discussions/4
BTW, it's really easy to convert the model now with: https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py :-)
This is a minor thing, but I think the branch needs to be specified as follows in the example code above:
pipe = StableDiffusionPipeline.from_pretrained(model_id, revision='diffusers-115k')
Ah, but the example code in https://huggingface.co/naclbit/trinart_stable_diffusion_v2/discussions/4 is correct, so it doesn't matter.
Thanks all for the help here, closing this as we've solved the problem I came in with.