|
--- |
|
license: creativeml-openrail-m |
|
language: |
|
- en |
|
tags: |
|
- art |
|
- Stable Diffusion |
|
--- |
|
## Model Card for lyraSD |
|
|
|
We consider the Diffusers as the much more extendable framework for the SD ecosystem. Therefore, we have made a **pivot to Diffusers**, leading to a complete update of lyraSD. |
|
|
|
lyraSD is currently the **fastest Stable Diffusion model** that can 100% align the outputs of **Diffusers** available, boasting an inference cost of only **0.36 seconds** for a 512x512 image, accelerating the process up to **50% faster** than the original version. |
|
|
|
Among its main features are: |
|
|
|
- **All Commonly used** SD1.5 and SDXL pipelines |
|
- - Text2Img |
|
- - Img2Img |
|
- - Inpainting |
|
- - ControlNetText2Img |
|
- - ControlNetImg2Img |
|
- - IpAdapterText2Img |
|
- **Fast ControlNet Hot Swap**: Can hot swap a ControlNet model weights within 0.6s |
|
- **Fast LoRA Hot Swap**: Can hot swap a Lora within 0.14s |
|
- 100% likeness to diffusers output |
|
- Supported Devices: Any GPU with SM version >= 80. For example, Nvidia Nvidia Ampere architecture (A2, A10, A16, A30, A40, A100), RTX 4090, 3080 and etc. |
|
|
|
## Speed |
|
|
|
### test environment |
|
|
|
- Device: Nvidia A100 40G |
|
- Nvidia driver version: 525.105.17 |
|
- Nvidia cuda version: 12.0 |
|
- Percision:fp16 |
|
- Steps: 20 |
|
- Sampler: EulerA |
|
|
|
### SD1.5 Text2Img Performance |
|
![Alt text](images/sd_txt2img.png) |
|
|
|
### SD1.5 ControlNet-Text2Img Performance |
|
![Alt text](images/sd_controlnet_txt2img.png) |
|
|
|
### SDXL Text2Img Performance |
|
![Alt text](images/sdxl_txt2img.png) |
|
|
|
### SDXL ControlNet-Text2Img Performance |
|
![Alt text](images/sdxl_controlnet_txt2img.png) |
|
|
|
### SD Model Load Performance |
|
![Alt text](images/model_load_performance.png) |
|
|
|
## Model Sources |
|
|
|
SD1.5 |
|
- **Checkpoint:** https://civitai.com/models/7371/rev-animated |
|
- **ControlNet:** https://huggingface.co/lllyasviel/sd-controlnet-canny |
|
- **Lora:** https://civitai.com/models/18323?modelVersionId=46846 |
|
|
|
SDXL |
|
- **Checkpoint:** https://civitai.com/models/43977?modelVersionId=227916 |
|
- **ControlNet:** https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0 |
|
- **Lora:** https://civitai.com/models/245889/dissolve-style-lora-15sdxl |
|
|
|
## SD1.5 Text2Img Uses |
|
|
|
```python |
|
import torch |
|
import time |
|
|
|
from lyrasd_model import LyraSdTxt2ImgPipeline |
|
|
|
# 存放模型文件的路径,应该包含一下结构(和diffusers一致): |
|
# 1. clip 模型 |
|
# 2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹 |
|
# 3. vae 模型 |
|
# 4. scheduler 配置 |
|
|
|
# LyraSD 的 C++ 编译动态链接库,其中包含 C++ CUDA 计算的细节 |
|
lib_path = "./lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm80.so" |
|
model_path = "./models/rev-animated" |
|
lora_path = "./models/xiaorenshu.safetensors" |
|
|
|
torch.classes.load_library(lib_path) |
|
|
|
# 构建 Txt2Img 的 Pipeline |
|
model = LyraSdTxt2ImgPipeline() |
|
|
|
model.reload_pipe(model_path) |
|
|
|
# load lora |
|
# lora model path, name,lora strength |
|
model.load_lora_v2(lora_path, "xiaorenshu", 0.4) |
|
|
|
# 准备应用的输入和超参数 |
|
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality" |
|
negative_prompt = "(((horrible))), (((scary))), (((naked))), (((large breasts))), high saturation, colorful, human:2, body:2, low quality, bad quality, lowres, out of frame, duplicate, watermark, signature, text, frames, cut, cropped, malformed limbs, extra limbs, (((missing arms))), (((missing legs)))" |
|
height, width = 512, 512 |
|
steps = 20 |
|
guidance_scale = 7 |
|
generator = torch.Generator().manual_seed(123) |
|
num_images = 1 |
|
|
|
start = time.perf_counter() |
|
# 推理生成 |
|
images = model(prompt, height, width, steps, |
|
guidance_scale, negative_prompt, num_images, |
|
generator=generator) |
|
print("image gen cost: ",time.perf_counter() - start) |
|
# 存储生成的图片 |
|
for i, image in enumerate(images): |
|
image.save(f"outputs/res_txt2img_lora_{i}.png") |
|
|
|
# unload lora, lora’s name, clear lora cache |
|
model.unload_lora_v2("xiaorenshu", True) |
|
``` |
|
|
|
## SDXL Text2Img Uses |
|
|
|
```python |
|
import torch |
|
import time |
|
|
|
from lyrasd_model import LyraSdXLTxt2ImgPipeline |
|
|
|
# 存放模型文件的路径,应该包含一下结构: |
|
# 1. clip 模型 |
|
# 2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹 |
|
# 3. vae 模型 |
|
# 4. scheduler 配置 |
|
|
|
# LyraSD 的 C++ 编译动态链接库,其中包含 C++ CUDA 计算的细节 |
|
lib_path = "./lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm80.so" |
|
model_path = "./models/helloworldSDXL20Fp16" |
|
lora_path = "./models/dissolve_sdxl.safetensors" |
|
|
|
torch.classes.load_library(lib_path) |
|
|
|
# 构建 Txt2Img 的 Pipeline |
|
model = LyraSdXLTxt2ImgPipeline() |
|
|
|
model.reload_pipe(model_path) |
|
|
|
# load lora |
|
# lora model path, name,lora strength |
|
model.load_lora_v2(lora_path, "dissolve_sdxl", 0.4) |
|
|
|
# 准备应用的输入和超参数 |
|
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality" |
|
negative_prompt = "(((horrible))), (((scary))), (((naked))), (((large breasts))), high saturation, colorful, human:2, body:2, low quality, bad quality, lowres, out of frame, duplicate, watermark, signature, text, frames, cut, cropped, malformed limbs, extra limbs, (((missing arms))), (((missing legs)))" |
|
height, width = 512, 512 |
|
steps = 20 |
|
guidance_scale = 7 |
|
generator = torch.Generator().manual_seed(123) |
|
num_images = 1 |
|
|
|
start = time.perf_counter() |
|
# 推理生成 |
|
images = model( prompt, |
|
height=height, |
|
width=width, |
|
num_inference_steps=steps, |
|
num_images_per_prompt=1, |
|
guidance_scale=guidance_scale, |
|
negative_prompt=negative_prompt, |
|
generator=generator |
|
) |
|
print("image gen cost: ",time.perf_counter() - start) |
|
# 存储生成的图片 |
|
for i, image in enumerate(images): |
|
image.save(f"outputs/res_txt2img_xl_lora_{i}.png") |
|
|
|
# unload lora,参数为 lora 的名字,是否清除 lora 缓存 |
|
model.unload_lora_v2("dissolve_sdxl", True) |
|
``` |
|
|
|
## Demo output |
|
|
|
### Text2Img |
|
#### SD1.5 Text2Img |
|
![text2img_demo](./outputs/res_txt2img_0.png) |
|
|
|
#### SD1.5 Text2Img with Lora |
|
![text2img_demo](./outputs/res_txt2img_lora_0.png) |
|
|
|
#### SDXL Text2Img |
|
![text2img_demo](./outputs/res_sdxl_txt2img_0.png) |
|
|
|
#### SDXL Text2Img with Lora |
|
![text2img_demo](./outputs/res_txt2img_xl_lora_0.png) |
|
|
|
|
|
<!-- ### Img2Img |
|
|
|
#### Img2Img input |
|
<img src="https://chuangxin-research-1258344705.cos.ap-guangzhou.myqcloud.com/share/files/seaside_town.png?q-sign-algorithm=sha1&q-ak=AKIDBF6i7GCtKWS8ZkgOtACzX3MQDl37xYty&q-sign-time=1692601590;1865401590&q-key-time=1692601590;1865401590&q-header-list=&q-url-param-list=&q-signature=ca04ca92d990d94813029c0d9ef29537e5f4637c" alt="img2img input" width="512"/> |
|
|
|
#### Img2Img output |
|
![text2img_demo](./outputs/res_img2img_0.png) --> |
|
|
|
### ControlNet Text2Img |
|
|
|
#### Control Image |
|
![text2img_demo](./control_bird_canny.png) |
|
|
|
#### SD1.5 ControlNet Text2Img Output |
|
![text2img_demo](./outputs/res_controlnet_txt2img_0.png) |
|
|
|
#### SDXL ControlNet Text2Img Output |
|
![text2img_demo](./outputs/res_controlnet_sdxl_txt2img_0.png) |
|
|
|
|
|
## Docker Environment Recommendation |
|
|
|
- For Cuda 12.X: we recommend ```yibolu96/lyrasd_workspace:1.0.0``` |
|
|
|
```bash |
|
docker pull yibolu96/lyrasd_workspace:1.0.0 |
|
docker run --rm -it --gpus all -v ./:/lyraSD yibolu96/lyrasd_workspace:1.0.0 |
|
|
|
pip install -r requirements.txt |
|
python txt2img_demo.py |
|
``` |
|
|
|
## Citation |
|
``` bibtex |
|
@Misc{lyraSD_2024, |
|
author = {Kangjian Wu, Zhengtao Wang, Yibo Lu, Haoxiong Su, Sa Xiao, Bin Wu}, |
|
title = {lyraSD: Accelerating Stable Diffusion with best flexibility}, |
|
howpublished = {\url{https://huggingface.co/TMElyralab/lyraSD}}, |
|
year = {2024} |
|
} |
|
``` |
|
|
|
## Report bug |
|
- start a discussion to report any bugs!--> https://huggingface.co/TMElyralab/lyraSD/discussions |
|
- report bug with a `[bug]` mark in the title. |