English
art
Stable Diffusion

Model Card for lyraSD

We consider the Diffusers as the much more extendable framework for the SD ecosystem. Therefore, we have made a pivot to Diffusers, leading to a complete update of lyraSD.

lyraSD is currently the fastest Stable Diffusion model that can 100% align the outputs of Diffusers available, boasting an inference cost of only 0.36 seconds for a 512x512 image, accelerating the process up to 50% faster than the original version.

Among its main features are:

  • All Commonly used SD1.5 and SDXL pipelines
    • Text2Img
    • Img2Img
    • Inpainting
    • ControlNetText2Img
    • ControlNetImg2Img
    • IpAdapterText2Img
  • Fast ControlNet Hot Swap: Can hot swap a ControlNet model weights within 0.6s
  • Fast LoRA Hot Swap: Can hot swap a Lora within 0.14s
  • 100% likeness to diffusers output
  • Supported Devices: Any GPU with SM version >= 80. For example, Nvidia Nvidia Ampere architecture (A2, A10, A16, A30, A40, A100), RTX 4090, 3080 and etc.

Speed

test environment

  • Device: Nvidia A100 40G
  • Nvidia driver version: 525.105.17
  • Nvidia cuda version: 12.0
  • Percision:fp16
  • Steps: 20
  • Sampler: EulerA

SD1.5 Text2Img Performance

Alt text

SD1.5 ControlNet-Text2Img Performance

Alt text

SDXL Text2Img Performance

Alt text

SDXL ControlNet-Text2Img Performance

Alt text

SD Model Load Performance

Alt text

Model Sources

SD1.5

SDXL

SD1.5 Text2Img Uses

import torch
import time

from lyrasd_model import LyraSdTxt2ImgPipeline

# 存放模型文件的路径,应该包含一下结构(和diffusers一致):
#   1. clip 模型
#   2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹
#   3. vae 模型
#   4. scheduler 配置

# LyraSD 的 C++ 编译动态链接库,其中包含 C++ CUDA 计算的细节
lib_path = "./lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm80.so"
model_path = "./models/rev-animated"
lora_path = "./models/xiaorenshu.safetensors"

torch.classes.load_library(lib_path)

# 构建 Txt2Img 的 Pipeline
model = LyraSdTxt2ImgPipeline()

model.reload_pipe(model_path)

# load lora
# lora model path, name,lora strength
model.load_lora_v2(lora_path, "xiaorenshu", 0.4)

# 准备应用的输入和超参数
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality"
negative_prompt = "(((horrible))), (((scary))), (((naked))), (((large breasts))), high saturation, colorful, human:2, body:2, low quality, bad quality, lowres, out of frame, duplicate, watermark, signature, text, frames, cut, cropped, malformed limbs, extra limbs, (((missing arms))), (((missing legs)))"
height, width = 512, 512
steps = 20
guidance_scale = 7
generator = torch.Generator().manual_seed(123)
num_images = 1

start = time.perf_counter()
# 推理生成
images = model(prompt, height, width, steps,
        guidance_scale, negative_prompt, num_images,
        generator=generator)
print("image gen cost: ",time.perf_counter() - start)
# 存储生成的图片
for i, image in enumerate(images):
    image.save(f"outputs/res_txt2img_lora_{i}.png")

# unload lora,      lora’s name,  clear lora cache
model.unload_lora_v2("xiaorenshu", True)

SDXL Text2Img Uses

import torch
import time

from lyrasd_model import LyraSdXLTxt2ImgPipeline

# 存放模型文件的路径,应该包含一下结构:
#   1. clip 模型
#   2. 转换好的优化后的 unet 模型,放入其中的 unet_bins 文件夹
#   3. vae 模型
#   4. scheduler 配置

# LyraSD 的 C++ 编译动态链接库,其中包含 C++ CUDA 计算的细节
lib_path = "./lyrasd_model/lyrasd_lib/libth_lyrasd_cu12_sm80.so"
model_path = "./models/helloworldSDXL20Fp16"
lora_path = "./models/dissolve_sdxl.safetensors"

torch.classes.load_library(lib_path)

# 构建 Txt2Img 的 Pipeline
model = LyraSdXLTxt2ImgPipeline()

model.reload_pipe(model_path)

# load lora
# lora model path, name,lora strength
model.load_lora_v2(lora_path, "dissolve_sdxl", 0.4)

# 准备应用的输入和超参数
prompt = "a cat, cute, cartoon, concise, traditional, chinese painting, Tang and Song Dynasties, masterpiece, 4k, 8k, UHD, best quality"
negative_prompt = "(((horrible))), (((scary))), (((naked))), (((large breasts))), high saturation, colorful, human:2, body:2, low quality, bad quality, lowres, out of frame, duplicate, watermark, signature, text, frames, cut, cropped, malformed limbs, extra limbs, (((missing arms))), (((missing legs)))"
height, width = 512, 512
steps = 20
guidance_scale = 7
generator = torch.Generator().manual_seed(123)
num_images = 1

start = time.perf_counter()
# 推理生成
images = model( prompt,
                height=height,
                width=width,
                num_inference_steps=steps,
                num_images_per_prompt=1,
                guidance_scale=guidance_scale,
                negative_prompt=negative_prompt,
                generator=generator
                )
print("image gen cost: ",time.perf_counter() - start)
# 存储生成的图片
for i, image in enumerate(images):
    image.save(f"outputs/res_txt2img_xl_lora_{i}.png")

# unload lora,参数为 lora 的名字,是否清除 lora 缓存
model.unload_lora_v2("dissolve_sdxl", True)

Demo output

Text2Img

SD1.5 Text2Img

text2img_demo

SD1.5 Text2Img with Lora

text2img_demo

SDXL Text2Img

text2img_demo

SDXL Text2Img with Lora

text2img_demo

ControlNet Text2Img

Control Image

text2img_demo

SD1.5 ControlNet Text2Img Output

text2img_demo

SDXL ControlNet Text2Img Output

text2img_demo

Docker Environment Recommendation

  • For Cuda 12.X: we recommend yibolu96/lyrasd_workspace:1.0.0
docker pull yibolu96/lyrasd_workspace:1.0.0
docker run --rm -it --gpus all -v ./:/lyraSD yibolu96/lyrasd_workspace:1.0.0

pip install -r requirements.txt
python txt2img_demo.py

Citation

@Misc{lyraSD_2024,
  author =       {Kangjian Wu, Zhengtao Wang, Yibo Lu, Haoxiong Su, Sa Xiao, Bin Wu},
  title =        {lyraSD: Accelerating Stable Diffusion with best flexibility},
  howpublished = {\url{https://huggingface.co/TMElyralab/lyraSD}},
  year =         {2024}
}

Report bug

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.