Update README.md
README.md
CHANGED
@@ -1,7 +1,6 @@
 ---
-
-
-{}
 ---

 # SD-Turbo Model Card
@@ -9,18 +8,22 @@
 <!-- Provide a quick summary of what the model is/does. -->
 ![row01](output_tile.jpg)
 SD-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.

 ## Model Details

 ### Model Description
-
-
-

 - **Developed by:** Stability AI
 - **Funded by:** Stability AI
 - **Model type:** Generative text-to-image model
-- **Finetuned from model:** [Stable Diffusion 1

 ### Model Sources

@@ -28,14 +31,19 @@ For research purposes, we recommend our `generative-models` Github repository (h
 which implements the most popular diffusion frameworks (both training and inference).

 - **Repository:** https://github.com/Stability-AI/generative-models
-- **Paper:**


 ## Evaluation
-![
-
-
-

 ## Uses

@@ -62,6 +70,7 @@ The model should not be used in any way that violates Stability AI's [Acceptable
 ## Limitations and Bias

 ### Limitations
 - The generated images are of a fixed resolution (512x512 pix), and the model does not achieve perfect photorealism.
 - The model cannot render legible text.
 - Faces and people in general may not be generated properly.
@@ -74,7 +83,4 @@ The model is intended for research purposes only.

 ## How to Get Started with the Model

-Check out https://github.com/Stability-AI/generative-models
-
-
-
@@ -1,7 +1,6 @@
 ---
+pipeline_tag: text-to-image
+inference: false
 ---

 # SD-Turbo Model Card
@@ -9,18 +8,22 @@
 <!-- Provide a quick summary of what the model is/does. -->
 ![row01](output_tile.jpg)
 SD-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.
+We release SD-Turbo as a research artifact, and to study small, distilled text-to-image models. For increased quality and prompt understanding,
+we recommend [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo/).

 ## Model Details

 ### Model Description
+SD-Turbo is a distilled version of [Stable Diffusion 2.1](https://huggingface.co/stabilityai/stable-diffusion-2-1), trained for real-time synthesis.
+SD-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the [technical report](https://stability.ai/research/adversarial-diffusion-distillation)), which allows sampling large-scale foundational
+image diffusion models in 1 to 4 steps at high image quality.
+This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an
+adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.

 - **Developed by:** Stability AI
 - **Funded by:** Stability AI
 - **Model type:** Generative text-to-image model
+- **Finetuned from model:** [Stable Diffusion 2.1](https://huggingface.co/stabilityai/stable-diffusion-2-1)

 ### Model Sources

@@ -28,14 +31,19 @@ For research purposes, we recommend our `generative-models` Github repository (h
 which implements the most popular diffusion frameworks (both training and inference).

 - **Repository:** https://github.com/Stability-AI/generative-models
+- **Paper:** https://stability.ai/research/adversarial-diffusion-distillation
+- **Demo [for the bigger SDXL-Turbo]:** http://clipdrop.co/stable-diffusion-turbo


 ## Evaluation
+![comparison1](image_quality_one_step.png)
+![comparison2](prompt_alignment_one_step.png)
+The charts above evaluate user preference for SD-Turbo over other single- and multi-step models.
+SD-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-XL evaluated at four (or fewer) steps.
+In addition, we see that using four steps for SD-Turbo further improves performance.
+**Note:** For increased quality, we recommend the bigger version [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo/).
+For details on the user study, we refer to the [research paper](https://stability.ai/research/adversarial-diffusion-distillation).
+

 ## Uses

@@ -62,6 +70,7 @@ The model should not be used in any way that violates Stability AI's [Acceptable
 ## Limitations and Bias

 ### Limitations
+- The quality and prompt alignment is lower than that of [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo/).
 - The generated images are of a fixed resolution (512x512 pix), and the model does not achieve perfect photorealism.
 - The model cannot render legible text.
 - Faces and people in general may not be generated properly.
@@ -74,7 +83,4 @@ The model is intended for research purposes only.

 ## How to Get Started with the Model

+Check out https://github.com/Stability-AI/generative-models
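
The updated "How to Get Started" section only points to the `generative-models` repository. For readers who would rather try the model from Python, the following is a minimal sketch, not something this commit adds. It assumes the weights are published on the Hugging Face Hub as `stabilityai/sd-turbo` and can be loaded with the `diffusers` library's `AutoPipelineForText2Image`. The `TurboSettings` class and `generate` function are hypothetical helpers; the single sampling step and the 512x512 resolution come from the card, while setting `guidance_scale` to `0.0` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurboSettings:
    """Hypothetical container for the sampling settings used in this sketch."""
    model_id: str = "stabilityai/sd-turbo"  # assumed Hub repository id
    num_inference_steps: int = 1            # the card describes sampling in 1 to 4 steps
    guidance_scale: float = 0.0             # assumption: classifier-free guidance disabled
    height: int = 512                       # fixed resolution listed under Limitations
    width: int = 512


def generate(prompt: str, settings: TurboSettings = TurboSettings()):
    """Load the pipeline and return a single PIL image for `prompt`."""
    # Imported lazily so the settings can be inspected without torch/diffusers installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    # Use half precision on a GPU, full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = AutoPipelineForText2Image.from_pretrained(settings.model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    result = pipe(
        prompt=prompt,
        num_inference_steps=settings.num_inference_steps,
        guidance_scale=settings.guidance_scale,
        height=settings.height,
        width=settings.width,
    )
    return result.images[0]
```

Calling `generate("a photo of a red fox in the snow").save("fox.png")` would download the weights on first use and write one image to disk. Passing `TurboSettings(num_inference_steps=4)` tries the four-step setting that the Evaluation section reports as further improving results.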