yibolu
committed on
Commit
•
b8c960d
1
Parent(s):
26dcb0c
update readme
Browse files- .gitattributes +15 -0
- README.md +110 -33
- control_bird_canny.png +0 -0
- images/model_load_performance.png +3 -0
- images/sd_controlnet_txt2img.png +3 -0
- images/sd_txt2img.png +3 -0
- images/sdxl_controlnet_txt2img.png +3 -0
- images/sdxl_txt2img.png +3 -0
- outputs/res_controlnet_img2img_0.png +0 -0
- outputs/res_controlnet_sdxl_txt2img.png +3 -0
- outputs/res_controlnet_txt2img_0.png +0 -0
- outputs/res_img2img_0.png +0 -0
- outputs/res_sdxl_txt2img_0.png +3 -0
- outputs/res_sdxl_txt2img_lora_0.png +3 -0
- outputs/res_txt2img_0.png +0 -0
- outputs/res_txt2img_lora_0.png +0 -0
.gitattributes
CHANGED
@@ -32,8 +32,23 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
32 |
*.xz filter=lfs diff=lfs merge=lfs -text
|
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
|
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu11_sm80.so filter=lfs diff=lfs merge=lfs -text
|
37 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu11_sm86.so filter=lfs diff=lfs merge=lfs -text
|
38 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm80.so filter=lfs diff=lfs merge=lfs -text
|
39 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm86.so filter=lfs diff=lfs merge=lfs -text
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
32 |
*.xz filter=lfs diff=lfs merge=lfs -text
|
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
+
*.png filter=lfs diff=lfs merge=lfs -text
|
36 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
37 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu11_sm80.so filter=lfs diff=lfs merge=lfs -text
|
38 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu11_sm86.so filter=lfs diff=lfs merge=lfs -text
|
39 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm80.so filter=lfs diff=lfs merge=lfs -text
|
40 |
lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm86.so filter=lfs diff=lfs merge=lfs -text
|
41 |
+
control_bird_canny.png filter=lfs diff=lfs merge=lfs -text
|
42 |
+
images/sdxl_controlnet_txt2img.png filter=lfs diff=lfs merge=lfs -text
|
43 |
+
outputs/res_controlnet_sdxl_txt2img.png filter=lfs diff=lfs merge=lfs -text
|
44 |
+
outputs/res_controlnet_txt2img_0.png filter=lfs diff=lfs merge=lfs -text
|
45 |
+
outputs/res_img2img_0.png filter=lfs diff=lfs merge=lfs -text
|
46 |
+
outputs/res_sdxl_txt2img_0.png filter=lfs diff=lfs merge=lfs -text
|
47 |
+
images/sd_controlnet_txt2img.png filter=lfs diff=lfs merge=lfs -text
|
48 |
+
images/sd_txt2img.png filter=lfs diff=lfs merge=lfs -text
|
49 |
+
outputs/res_controlnet_img2img_0.png filter=lfs diff=lfs merge=lfs -text
|
50 |
+
outputs/res_sdxl_txt2img_lora_0.png filter=lfs diff=lfs merge=lfs -text
|
51 |
+
outputs/res_txt2img_0.png filter=lfs diff=lfs merge=lfs -text
|
52 |
+
outputs/res_txt2img_lora_0.png filter=lfs diff=lfs merge=lfs -text
|
53 |
+
images/model_load_performance.png filter=lfs diff=lfs merge=lfs -text
|
54 |
+
images/sdxl_txt2img.png filter=lfs diff=lfs merge=lfs -text
|
README.md
CHANGED
@@ -10,49 +10,61 @@ tags:
|
|
10 |
|
11 |
We consider the Diffusers as the much more extendable framework for the SD ecosystem. Therefore, we have made a **pivot to Diffusers**, leading to a complete update of lyraSD.
|
12 |
|
13 |
-
lyraSD is currently the **fastest Stable Diffusion model** that can 100% align the outputs of **Diffusers** available, boasting an inference cost of only **0.
|
14 |
|
15 |
Among its main features are:
|
16 |
|
17 |
-
- **
|
18 |
-
- **LoRA Hot Swap**: Can hot swap a Lora within 0.5s (0.1s if cached)
|
19 |
-
- 100% likeness to diffusers output
|
20 |
-
- 4 Commonly used Pipelines
|
21 |
- - Text2Img
|
22 |
- - Img2Img
|
|
|
23 |
- - ControlNetText2Img
|
24 |
- - ControlNetImg2Img
|
25 |
-
-
|
|
|
|
|
|
|
|
|
26 |
|
27 |
## Speed
|
28 |
|
29 |
### test environment
|
30 |
|
31 |
-
-
|
32 |
-
-
|
33 |
-
-
|
34 |
-
-
|
35 |
-
-
|
|
|
36 |
|
37 |
-
### Text2Img
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
|
43 |
-
###
|
44 |
-
|
45 |
-
|
46 |
-
|
47 |
-
|
|
|
|
|
|
|
48 |
|
49 |
## Model Sources
|
50 |
|
|
|
51 |
- **Checkpoint:** https://civitai.com/models/7371/rev-animated
|
52 |
- **ControlNet:** https://huggingface.co/lllyasviel/sd-controlnet-canny
|
53 |
- **Lora:** https://civitai.com/models/18323?modelVersionId=46846
|
54 |
|
55 |
-
|
|
|
|
|
|
|
|
|
|
|
56 |
|
57 |
```python
|
58 |
import torch
|
@@ -60,7 +72,7 @@ import time
|
|
60 |
|
61 |
from lyrasd_model import LyraSdTxt2ImgPipeline
|
62 |
|
63 |
-
#
|
64 |
# 1. clip 模型
|
65 |
# 2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹
|
66 |
# 3. vae 模型
|
@@ -75,8 +87,8 @@ lora_path = "./models/lyrasd_xiaorenshu_lora"
|
|
75 |
model = LyraSdTxt2ImgPipeline(model_path, lib_path)
|
76 |
|
77 |
# load lora
|
78 |
-
#
|
79 |
-
model.
|
80 |
|
81 |
# 准备应用的输入和超参数
|
82 |
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality"
|
@@ -97,36 +109,101 @@ print("image gen cost: ",time.perf_counter() - start)
|
|
97 |
for i, image in enumerate(images):
|
98 |
image.save(f"outputs/res_txt2img_lora_{i}.png")
|
99 |
|
100 |
-
# unload lora
|
101 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
102 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
103 |
|
|
|
|
|
104 |
```
|
|
|
105 |
## Demo output
|
106 |
|
107 |
### Text2Img
|
108 |
-
#### Text2Img
|
109 |
![text2img_demo](./outputs/res_txt2img_0.png)
|
110 |
|
111 |
-
#### Text2Img with Lora
|
112 |
![text2img_demo](./outputs/res_txt2img_lora_0.png)
|
113 |
|
114 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
115 |
|
116 |
#### Img2Img input
|
117 |
<img src="https://chuangxin-research-1258344705.cos.ap-guangzhou.myqcloud.com/share/files/seaside_town.png?q-sign-algorithm=sha1&q-ak=AKIDBF6i7GCtKWS8ZkgOtACzX3MQDl37xYty&q-sign-time=1692601590;1865401590&q-key-time=1692601590;1865401590&q-header-list=&q-url-param-list=&q-signature=ca04ca92d990d94813029c0d9ef29537e5f4637c" alt="img2img input" width="512"/>
|
118 |
|
119 |
#### Img2Img output
|
120 |
-
![text2img_demo](./outputs/res_img2img_0.png)
|
121 |
|
122 |
### ControlNet Text2Img
|
123 |
|
124 |
#### Control Image
|
125 |
![text2img_demo](./control_bird_canny.png)
|
126 |
|
127 |
-
#### ControlNet Text2Img Output
|
128 |
![text2img_demo](./outputs/res_controlnet_txt2img_0.png)
|
129 |
|
|
|
|
|
|
|
|
|
130 |
## Docker Environment Recommendation
|
131 |
|
132 |
- For Cuda 11.X: we recommend ```nvcr.io/nvidia/pytorch:22.12-py3```
|
@@ -146,7 +223,7 @@ python txt2img_demo.py
|
|
146 |
author = {Kangjian Wu, Zhengtao Wang, Yibo Lu, Haoxiong Su, Bin Wu},
|
147 |
title = {lyraSD: Accelerating Stable Diffusion with best flexibility},
|
148 |
howpublished = {\url{https://huggingface.co/TMElyralab/lyraSD}},
|
149 |
-
year = {
|
150 |
}
|
151 |
```
|
152 |
|
|
|
10 |
|
11 |
We consider the Diffusers as the much more extendable framework for the SD ecosystem. Therefore, we have made a **pivot to Diffusers**, leading to a complete update of lyraSD.
|
12 |
|
13 |
+
lyraSD is currently the **fastest Stable Diffusion model** that can 100% align the outputs of **Diffusers** available, boasting an inference cost of only **0.36 seconds** for a 512x512 image, accelerating the process up to **50% faster** than the original version.
|
14 |
|
15 |
Among its main features are:
|
16 |
|
17 |
+
- **All Commonly used** SD1.5 and SDXL pipelines
|
|
|
|
|
|
|
18 |
- - Text2Img
|
19 |
- - Img2Img
|
20 |
+
- - Inpainting
|
21 |
- - ControlNetText2Img
|
22 |
- - ControlNetImg2Img
|
23 |
+
- - IpAdapterText2Img
|
24 |
+
- **Fast ControlNet Hot Swap**: Can hot swap a ControlNet model weights within 0.6s
|
25 |
+
- **Fast LoRA Hot Swap**: Can hot swap a Lora within 0.14s
|
26 |
+
- 100% likeness to diffusers output
|
27 |
+
- Supported Devices: Any GPU with SM version >= 80. For example, Nvidia Ampere architecture (A2, A10, A16, A30, A40, A100), RTX 4090, 3080 and etc.
|
28 |
|
29 |
## Speed
|
30 |
|
31 |
### test environment
|
32 |
|
33 |
+
- Device: Nvidia A100 40G
|
34 |
+
- Nvidia driver version: 525.105.17
|
35 |
+
- Nvidia cuda version: 12.0
|
36 |
+
- Precision: fp16
|
37 |
+
- Steps: 20
|
38 |
+
- Sampler: EulerA
|
39 |
|
40 |
+
### SD1.5 Text2Img Performance
|
41 |
+
![Alt text](images/sd_txt2img.png)
|
42 |
+
|
43 |
+
### SD1.5 ControlNet-Text2Img Performance
|
44 |
+
![Alt text](images/sd_controlnet_txt2img.png)
|
45 |
|
46 |
+
### SDXL Text2Img Performance
|
47 |
+
![Alt text](images/sdxl_txt2img.png)
|
48 |
+
|
49 |
+
### SDXL ControlNet-Text2Img Performance
|
50 |
+
![Alt text](images/sdxl_controlnet_txt2img.png)
|
51 |
+
|
52 |
+
### SD Model Load Performance
|
53 |
+
![Alt text](images/model_load_performance.png)
|
54 |
|
55 |
## Model Sources
|
56 |
|
57 |
+
SD1.5
|
58 |
- **Checkpoint:** https://civitai.com/models/7371/rev-animated
|
59 |
- **ControlNet:** https://huggingface.co/lllyasviel/sd-controlnet-canny
|
60 |
- **Lora:** https://civitai.com/models/18323?modelVersionId=46846
|
61 |
|
62 |
+
SDXL
|
63 |
+
- **Checkpoint:** https://civitai.com/models/43977?modelVersionId=227916
|
64 |
+
- **ControlNet:** https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0
|
65 |
+
- **Lora:** https://civitai.com/models/18323?modelVersionId=46846
|
66 |
+
|
67 |
+
## SD1.5 Text2Img Uses
|
68 |
|
69 |
```python
|
70 |
import torch
|
|
|
72 |
|
73 |
from lyrasd_model import LyraSdTxt2ImgPipeline
|
74 |
|
75 |
+
# 存放模型文件的路径,应该包含以下结构(和diffusers一致):
|
76 |
# 1. clip 模型
|
77 |
# 2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹
|
78 |
# 3. vae 模型
|
|
|
87 |
model = LyraSdTxt2ImgPipeline(model_path, lib_path)
|
88 |
|
89 |
# load lora
|
90 |
+
# lora model path, name,lora strength
|
91 |
+
model.load_lora_v2(lora_path, "xiaorenshu", 0.4)
|
92 |
|
93 |
# 准备应用的输入和超参数
|
94 |
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality"
|
|
|
109 |
for i, image in enumerate(images):
|
110 |
image.save(f"outputs/res_txt2img_lora_{i}.png")
|
111 |
|
112 |
+
# unload lora, lora's name, clear lora cache
|
113 |
+
model.unload_lora_v2("xiaorenshu", True)
|
114 |
+
```
|
115 |
+
|
116 |
+
## SDXL Text2Img Uses
|
117 |
+
|
118 |
+
```python
|
119 |
+
import torch
|
120 |
+
import time
|
121 |
+
|
122 |
+
from lyrasd_model import LyraSdXLTxt2ImgPipeline
|
123 |
+
|
124 |
+
# 存放模型文件的路径,应该包含以下结构:
|
125 |
+
# 1. clip 模型
|
126 |
+
# 2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹
|
127 |
+
# 3. vae 模型
|
128 |
+
# 4. scheduler 配置
|
129 |
+
|
130 |
+
# LyraSD 的 C++ 编译动态链接库,其中包含 C++ CUDA 计算的细节
|
131 |
+
lib_path = "./lyrasd_model/lyrasd_lib/libth_lyrasd_cu11_sm80.so"
|
132 |
+
model_path = "./models/lyrasd_helloworldSDXL20Fp16"
|
133 |
+
lora_path = "./models/lyrasd_xiaorenshu_lora"
|
134 |
+
|
135 |
+
# 构建 Txt2Img 的 Pipeline
|
136 |
+
model = LyraSdXLTxt2ImgPipeline(model_path, lib_path)
|
137 |
+
|
138 |
+
# load lora
|
139 |
+
# lora model path, name,lora strength
|
140 |
+
model.load_lora_v2(lora_path, "xiaorenshu", 0.4)
|
141 |
+
|
142 |
+
# 准备应用的输入和超参数
|
143 |
+
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality"
|
144 |
+
negative_prompt = "(((horrible))), (((scary))), (((naked))), (((large breasts))), high saturation, colorful, human:2, body:2, low quality, bad quality, lowres, out of frame, duplicate, watermark, signature, text, frames, cut, cropped, malformed limbs, extra limbs, (((missing arms))), (((missing legs)))"
|
145 |
+
height, width = 512, 512
|
146 |
+
steps = 30
|
147 |
+
guidance_scale = 7
|
148 |
+
generator = torch.Generator().manual_seed(123)
|
149 |
+
num_images = 1
|
150 |
|
151 |
+
start = time.perf_counter()
|
152 |
+
# 推理生成
|
153 |
+
images = model( prompt,
|
154 |
+
height=height,
|
155 |
+
width=width,
|
156 |
+
num_inference_steps=steps,
|
157 |
+
num_images_per_prompt=1,
|
158 |
+
guidance_scale=guidance_scale,
|
159 |
+
negative_prompt=negative_prompt,
|
160 |
+
generator=generator
|
161 |
+
)
|
162 |
+
print("image gen cost: ",time.perf_counter() - start)
|
163 |
+
# 存储生成的图片
|
164 |
+
for i, image in enumerate(images):
|
165 |
+
image.save(f"outputs/res_txt2img_xl_lora_{i}.png")
|
166 |
|
167 |
+
# unload lora,参数为 lora 的名字,是否清除 lora 缓存
|
168 |
+
model.unload_lora_v2("xiaorenshu", True)
|
169 |
```
|
170 |
+
|
171 |
## Demo output
|
172 |
|
173 |
### Text2Img
|
174 |
+
#### SD1.5 Text2Img
|
175 |
![text2img_demo](./outputs/res_txt2img_0.png)
|
176 |
|
177 |
+
#### SD1.5 Text2Img with Lora
|
178 |
![text2img_demo](./outputs/res_txt2img_lora_0.png)
|
179 |
|
180 |
+
#### SDXL Text2Img
|
181 |
+
![text2img_demo](./outputs/res_sdxl_txt2img_0.png)
|
182 |
+
|
183 |
+
#### SDXL Text2Img with Lora
|
184 |
+
![text2img_demo](./outputs/res_sdxl_txt2img_lora_0.png)
|
185 |
+
|
186 |
+
|
187 |
+
<!-- ### Img2Img
|
188 |
|
189 |
#### Img2Img input
|
190 |
<img src="https://chuangxin-research-1258344705.cos.ap-guangzhou.myqcloud.com/share/files/seaside_town.png?q-sign-algorithm=sha1&q-ak=AKIDBF6i7GCtKWS8ZkgOtACzX3MQDl37xYty&q-sign-time=1692601590;1865401590&q-key-time=1692601590;1865401590&q-header-list=&q-url-param-list=&q-signature=ca04ca92d990d94813029c0d9ef29537e5f4637c" alt="img2img input" width="512"/>
|
191 |
|
192 |
#### Img2Img output
|
193 |
+
![text2img_demo](./outputs/res_img2img_0.png) -->
|
194 |
|
195 |
### ControlNet Text2Img
|
196 |
|
197 |
#### Control Image
|
198 |
![text2img_demo](./control_bird_canny.png)
|
199 |
|
200 |
+
#### SD1.5 ControlNet Text2Img Output
|
201 |
![text2img_demo](./outputs/res_controlnet_txt2img_0.png)
|
202 |
|
203 |
+
#### SDXL ControlNet Text2Img Output
|
204 |
+
![text2img_demo](./outputs/res_controlnet_sdxl_txt2img.png)
|
205 |
+
|
206 |
+
|
207 |
## Docker Environment Recommendation
|
208 |
|
209 |
- For Cuda 11.X: we recommend ```nvcr.io/nvidia/pytorch:22.12-py3```
|
|
|
223 |
author = {Kangjian Wu, Zhengtao Wang, Yibo Lu, Haoxiong Su, Bin Wu},
|
224 |
title = {lyraSD: Accelerating Stable Diffusion with best flexibility},
|
225 |
howpublished = {\url{https://huggingface.co/TMElyralab/lyraSD}},
|
226 |
+
year = {2024}
|
227 |
}
|
228 |
```
|
229 |
|
control_bird_canny.png
CHANGED
Git LFS Details
|
images/model_load_performance.png
ADDED
Git LFS Details
|
images/sd_controlnet_txt2img.png
ADDED
Git LFS Details
|
images/sd_txt2img.png
ADDED
Git LFS Details
|
images/sdxl_controlnet_txt2img.png
ADDED
Git LFS Details
|
images/sdxl_txt2img.png
ADDED
Git LFS Details
|
outputs/res_controlnet_img2img_0.png
CHANGED
Git LFS Details
|
outputs/res_controlnet_sdxl_txt2img.png
ADDED
Git LFS Details
|
outputs/res_controlnet_txt2img_0.png
CHANGED
Git LFS Details
|
outputs/res_img2img_0.png
CHANGED
Git LFS Details
|
outputs/res_sdxl_txt2img_0.png
ADDED
Git LFS Details
|
outputs/res_sdxl_txt2img_lora_0.png
ADDED
Git LFS Details
|
outputs/res_txt2img_0.png
CHANGED
Git LFS Details
|
outputs/res_txt2img_lora_0.png
CHANGED
Git LFS Details
|