Update README.md
Browse files
README.md
CHANGED
@@ -13,7 +13,7 @@ inference: false
|
|
13 |
Official Model Repo of the paper: [Trajectory Consistency Distillation](https://arxiv.org/abs/2402.19159).
|
14 |
For more information, please check the [GitHub Repo](https://github.com/jabir-zheng/TCD) and [Project Page](https://mhh0318.github.io/tcd/).
|
15 |
|
16 |
-
Also welcome to try the demo host on [🤗 Space(https://huggingface.co/spaces/h1t/TCD)
|
17 |
|
18 |
![](./assets/teaser_fig.png)
|
19 |
|
@@ -48,7 +48,7 @@ And then we clone the repo.
|
|
48 |
git clone https://github.com/jabir-zheng/TCD.git
|
49 |
cd TCD
|
50 |
```
|
51 |
-
Here, we demonstrate the applicability of our TCD LoRA to various models, including [SDXL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), [SDXL Inpainting](https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1), a community model named [Animagine XL](https://huggingface.co/cagliostrolab/animagine-xl-3.0), a styled LoRA [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL), pretrained [Depth Controlnet](https://huggingface.co/diffusers/controlnet-depth-sdxl-1.0), and [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) to accelerate image generation with high quality in
|
52 |
|
53 |
### Text-to-Image generation
|
54 |
```py
|
@@ -58,7 +58,7 @@ from scheduling_tcd import TCDScheduler
|
|
58 |
|
59 |
device = "cuda"
|
60 |
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
61 |
-
tcd_lora_id = ""
|
62 |
|
63 |
pipe = StableDiffusionXLPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
64 |
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
@@ -79,7 +79,7 @@ image = pipe(
|
|
79 |
generator=torch.Generator(device=device).manual_seed(0),
|
80 |
).images[0]
|
81 |
```
|
82 |
-
![](./assets/
|
83 |
|
84 |
### Inpainting
|
85 |
```py
|
@@ -90,7 +90,7 @@ from scheduling_tcd import TCDScheduler
|
|
90 |
|
91 |
device = "cuda"
|
92 |
base_model_id = "diffusers/stable-diffusion-xl-1.0-inpainting-0.1"
|
93 |
-
tcd_lora_id = ""
|
94 |
|
95 |
pipe = AutoPipelineForInpainting.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
96 |
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
@@ -119,7 +119,7 @@ image = pipe(
|
|
119 |
|
120 |
grid_image = make_image_grid([init_image, mask_image, image], rows=1, cols=3)
|
121 |
```
|
122 |
-
![](./assets/
|
123 |
|
124 |
### Versatile for Community Models
|
125 |
```py
|
@@ -129,7 +129,7 @@ from scheduling_tcd import TCDScheduler
|
|
129 |
|
130 |
device = "cuda"
|
131 |
base_model_id = "cagliostrolab/animagine-xl-3.0"
|
132 |
-
tcd_lora_id = ""
|
133 |
|
134 |
pipe = StableDiffusionXLPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
135 |
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
@@ -160,7 +160,7 @@ from scheduling_tcd import TCDScheduler
|
|
160 |
|
161 |
device = "cuda"
|
162 |
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
163 |
-
tcd_lora_id = ""
|
164 |
styled_lora_id = "TheLastBen/Papercut_SDXL"
|
165 |
|
166 |
pipe = StableDiffusionXLPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
@@ -186,6 +186,7 @@ image = pipe(
|
|
186 |
![](./assets/styled_lora.png)
|
187 |
|
188 |
### Compatibility with ControlNet
|
|
|
189 |
```py
|
190 |
import torch
|
191 |
import numpy as np
|
@@ -220,8 +221,8 @@ def get_depth_map(image):
|
|
220 |
return image
|
221 |
|
222 |
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
223 |
-
controlnet_id = "/
|
224 |
-
tcd_lora_id = ""
|
225 |
|
226 |
controlnet = ControlNetModel.from_pretrained(
|
227 |
controlnet_id,
|
@@ -260,15 +261,65 @@ image = pipe(
|
|
260 |
|
261 |
grid_image = make_image_grid([depth_image, image], rows=1, cols=2)
|
262 |
```
|
263 |
-
![](./assets/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
264 |
|
265 |
### Compatibility with IP-Adapter
|
266 |
-
Please refer to the official [repository](https://github.com/tencent-ailab/IP-Adapter/tree/main) for instructions on installing dependencies for IP-Adapter.
|
267 |
```py
|
268 |
import torch
|
269 |
-
from PIL import Image
|
270 |
from diffusers import StableDiffusionXLPipeline
|
271 |
-
from diffusers.utils import make_image_grid
|
272 |
|
273 |
from ip_adapter import IPAdapterXL
|
274 |
from scheduling_tcd import TCDScheduler
|
@@ -277,7 +328,7 @@ device = "cuda"
|
|
277 |
base_model_path = "stabilityai/stable-diffusion-xl-base-1.0"
|
278 |
image_encoder_path = "sdxl_models/image_encoder"
|
279 |
ip_ckpt = "sdxl_models/ip-adapter_sdxl.bin"
|
280 |
-
tcd_lora_id = ""
|
281 |
|
282 |
pipe = StableDiffusionXLPipeline.from_pretrained(
|
283 |
base_model_path,
|
@@ -291,8 +342,7 @@ pipe.fuse_lora()
|
|
291 |
|
292 |
ip_model = IPAdapterXL(pipe, image_encoder_path, ip_ckpt, device)
|
293 |
|
294 |
-
ref_image =
|
295 |
-
ref_image.resize((512, 512))
|
296 |
|
297 |
prompt = "best quality, high quality, wearing sunglasses"
|
298 |
|
@@ -311,6 +361,7 @@ grid_image = make_image_grid([ref_image, image], rows=1, cols=2)
|
|
311 |
```
|
312 |
![](./assets/ip_adapter.png)
|
313 |
|
|
|
314 |
## Citation
|
315 |
```bibtex
|
316 |
@misc{zheng2024trajectory,
|
|
|
13 |
Official Model Repo of the paper: [Trajectory Consistency Distillation](https://arxiv.org/abs/2402.19159).
|
14 |
For more information, please check the [GitHub Repo](https://github.com/jabir-zheng/TCD) and [Project Page](https://mhh0318.github.io/tcd/).
|
15 |
|
16 |
+
Also welcome to try the demo hosted on [🤗 Space](https://huggingface.co/spaces/h1t/TCD).
|
17 |
|
18 |
![](./assets/teaser_fig.png)
|
19 |
|
|
|
48 |
git clone https://github.com/jabir-zheng/TCD.git
|
49 |
cd TCD
|
50 |
```
|
51 |
+
Here, we demonstrate the applicability of our TCD LoRA to various models, including [SDXL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), [SDXL Inpainting](https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1), a community model named [Animagine XL](https://huggingface.co/cagliostrolab/animagine-xl-3.0), a styled LoRA [Papercut](https://huggingface.co/TheLastBen/Papercut_SDXL), pretrained [Depth Controlnet](https://huggingface.co/diffusers/controlnet-depth-sdxl-1.0), [Canny Controlnet](https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0) and [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) to accelerate image generation with high quality in a few steps.
|
52 |
|
53 |
### Text-to-Image generation
|
54 |
```py
|
|
|
58 |
|
59 |
device = "cuda"
|
60 |
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
61 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
62 |
|
63 |
pipe = StableDiffusionXLPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
64 |
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
|
|
79 |
generator=torch.Generator(device=device).manual_seed(0),
|
80 |
).images[0]
|
81 |
```
|
82 |
+
![](./assets/t2i_tcd.png)
|
83 |
|
84 |
### Inpainting
|
85 |
```py
|
|
|
90 |
|
91 |
device = "cuda"
|
92 |
base_model_id = "diffusers/stable-diffusion-xl-1.0-inpainting-0.1"
|
93 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
94 |
|
95 |
pipe = AutoPipelineForInpainting.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
96 |
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
|
|
119 |
|
120 |
grid_image = make_image_grid([init_image, mask_image, image], rows=1, cols=3)
|
121 |
```
|
122 |
+
![](./assets/inpainting_tcd.png)
|
123 |
|
124 |
### Versatile for Community Models
|
125 |
```py
|
|
|
129 |
|
130 |
device = "cuda"
|
131 |
base_model_id = "cagliostrolab/animagine-xl-3.0"
|
132 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
133 |
|
134 |
pipe = StableDiffusionXLPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
135 |
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
|
|
160 |
|
161 |
device = "cuda"
|
162 |
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
163 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
164 |
styled_lora_id = "TheLastBen/Papercut_SDXL"
|
165 |
|
166 |
pipe = StableDiffusionXLPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to(device)
|
|
|
186 |
![](./assets/styled_lora.png)
|
187 |
|
188 |
### Compatibility with ControlNet
|
189 |
+
#### Depth ControlNet
|
190 |
```py
|
191 |
import torch
|
192 |
import numpy as np
|
|
|
221 |
return image
|
222 |
|
223 |
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
224 |
+
controlnet_id = "diffusers/controlnet-depth-sdxl-1.0"
|
225 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
226 |
|
227 |
controlnet = ControlNetModel.from_pretrained(
|
228 |
controlnet_id,
|
|
|
261 |
|
262 |
grid_image = make_image_grid([depth_image, image], rows=1, cols=2)
|
263 |
```
|
264 |
+
![](./assets/controlnet_depth_tcd.png)
|
265 |
+
|
266 |
+
#### Canny ControlNet
|
267 |
+
```py
|
268 |
+
import torch
|
269 |
+
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
|
270 |
+
from diffusers.utils import load_image, make_image_grid
|
271 |
+
from scheduling_tcd import TCDScheduler
|
272 |
+
|
273 |
+
device = "cuda"
|
274 |
+
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
275 |
+
controlnet_id = "diffusers/controlnet-canny-sdxl-1.0"
|
276 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
277 |
+
|
278 |
+
controlnet = ControlNetModel.from_pretrained(
|
279 |
+
controlnet_id,
|
280 |
+
torch_dtype=torch.float16,
|
281 |
+
variant="fp16",
|
282 |
+
).to(device)
|
283 |
+
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
|
284 |
+
base_model_id,
|
285 |
+
controlnet=controlnet,
|
286 |
+
torch_dtype=torch.float16,
|
287 |
+
variant="fp16",
|
288 |
+
).to(device)
|
289 |
+
pipe.enable_model_cpu_offload()
|
290 |
+
|
291 |
+
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
292 |
+
|
293 |
+
pipe.load_lora_weights(tcd_lora_id)
|
294 |
+
pipe.fuse_lora()
|
295 |
+
|
296 |
+
prompt = "ultrarealistic shot of a furry blue bird"
|
297 |
+
|
298 |
+
canny_image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/bird_canny.png")
|
299 |
+
|
300 |
+
controlnet_conditioning_scale = 0.5 # recommended for good generalization
|
301 |
+
|
302 |
+
image = pipe(
|
303 |
+
prompt,
|
304 |
+
image=canny_image,
|
305 |
+
num_inference_steps=4,
|
306 |
+
guidance_scale=0,
|
307 |
+
eta=0.3, # A parameter (referred to as `gamma` in the paper) is used to control the stochasticity in every step. A value of 0.3 often yields good results.
|
308 |
+
controlnet_conditioning_scale=controlnet_conditioning_scale,
|
309 |
+
generator=torch.Generator(device=device).manual_seed(0),
|
310 |
+
).images[0]
|
311 |
+
|
312 |
+
grid_image = make_image_grid([canny_image, image], rows=1, cols=2)
|
313 |
+
```
|
314 |
+
|
315 |
+
![](./assets/controlnet_canny_tcd.png)
|
316 |
|
317 |
### Compatibility with IP-Adapter
|
318 |
+
⚠️ Please refer to the official [repository](https://github.com/tencent-ailab/IP-Adapter/tree/main) for instructions on installing dependencies for IP-Adapter.
|
319 |
```py
|
320 |
import torch
|
|
|
321 |
from diffusers import StableDiffusionXLPipeline
|
322 |
+
from diffusers.utils import load_image, make_image_grid
|
323 |
|
324 |
from ip_adapter import IPAdapterXL
|
325 |
from scheduling_tcd import TCDScheduler
|
|
|
328 |
base_model_path = "stabilityai/stable-diffusion-xl-base-1.0"
|
329 |
image_encoder_path = "sdxl_models/image_encoder"
|
330 |
ip_ckpt = "sdxl_models/ip-adapter_sdxl.bin"
|
331 |
+
tcd_lora_id = "h1t/TCD-SDXL-LoRA"
|
332 |
|
333 |
pipe = StableDiffusionXLPipeline.from_pretrained(
|
334 |
base_model_path,
|
|
|
342 |
|
343 |
ip_model = IPAdapterXL(pipe, image_encoder_path, ip_ckpt, device)
|
344 |
|
345 |
+
ref_image = load_image("https://raw.githubusercontent.com/tencent-ailab/IP-Adapter/main/assets/images/woman.png").resize((512, 512))
|
|
|
346 |
|
347 |
prompt = "best quality, high quality, wearing sunglasses"
|
348 |
|
|
|
361 |
```
|
362 |
![](./assets/ip_adapter.png)
|
363 |
|
364 |
+
|
365 |
## Citation
|
366 |
```bibtex
|
367 |
@misc{zheng2024trajectory,
|