Spaces:
Sleeping
title: Real-Time SD Turbo
emoji: ⚡️
colorFrom: gray
colorTo: indigo
sdk: docker
pinned: false
suggested_hardware: a10g-small
Real-Time Latent Consistency Model
This demo showcases Latent Consistency Model (LCM) using Diffusers with a MJPEG stream server. You can read more about LCM + LoRAs with diffusers here.
You need a webcam to run this demo. 🤗
See a collecting with live demos here
Running Locally
You need CUDA and Python 3.10, Node > 19, Mac with an M1/M2/M3 chip or Intel Arc GPU
Install
python -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
cd frontend && npm install && npm run build && cd ..
python run.py --reload --pipeline controlnet
Pipelines
You can build your own pipeline following examples here here, don't forget to fuild the frontend first
cd frontend && npm install && npm run build && cd ..
LCM
Image to Image
python run.py --reload --pipeline img2img
LCM
Text to Image
python run.py --reload --pipeline txt2img
Image to Image ControlNet Canny
python run.py --reload --pipeline controlnet
LCM + LoRa
Using LCM-LoRA, giving it the super power of doing inference in as little as 4 steps. Learn more here or technical report
Image to Image ControlNet Canny LoRa
python run.py --reload --pipeline controlnetLoraSD15
or SDXL, note that SDXL is slower than SD15 since the inference runs on 1024x1024 images
python run.py --reload --pipeline controlnetLoraSDXL
Text to Image
python run.py --reload --pipeline txt2imgLora
or
python run.py --reload --pipeline txt2imgLoraSDXL
Setting environment variables
TIMEOUT
: limit user session timeoutSAFETY_CHECKER
: disabled if you want NSFW filter offMAX_QUEUE_SIZE
: limit number of users on current app instanceTORCH_COMPILE
: enable if you want to use torch compile for faster inference works well on A100 GPUs
USE_TAESD
: enable if you want to use Autoencoder Tiny
If you run using bash build-run.sh
you can set PIPELINE
variables to choose the pipeline you want to run
PIPELINE=txt2imgLoraSDXL bash build-run.sh
and setting environment variables
TIMEOUT=120 SAFETY_CHECKER=True MAX_QUEUE_SIZE=4 python run.py --reload --pipeline txt2imgLoraSDXL
If you're running locally and want to test it on Mobile Safari, the webserver needs to be served over HTTPS, or follow this instruction on my comment
openssl req -newkey rsa:4096 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem
python run.py --reload --ssl-certfile=certificate.pem --ssl-keyfile=key.pem
Docker
You need NVIDIA Container Toolkit for Docker, defaults to `controlnet``
docker build -t lcm-live .
docker run -ti -p 7860:7860 --gpus all lcm-live
reuse models data from host to avoid downloading them again, you can change ~/.cache/huggingface
to any other directory, but if you use hugingface-cli locally, you can share the same cache
docker run -ti -p 7860:7860 -e HF_HOME=/data -v ~/.cache/huggingface:/data --gpus all lcm-live
or with environment variables
docker run -ti -e PIPELINE=txt2imgLoraSDXL -p 7860:7860 --gpus all lcm-live
Development Mode
python run.py --reload
Demo on Hugging Face
https://huggingface.co/spaces/radames/Real-Time-Latent-Consistency-Model