SD.Next includes *experimental* support for additional model pipelines  
This includes models such as:

- **Stable Diffusion XL**
- **Kandinsky**
- **Deep Floyd IF**

And soon:

- **Shap-E**, **UniDiffuser**, **Consistency Models**, **Diffedit Zero-Shot**
- **Text2Video**, **Video2Video**, etc...

*This has been made possible by integration of [huggingface diffusers](https://huggingface.co/docs/diffusers/index) library with the help of huggingface team!*

## How to

Moved to [Installation](https://github.com/vladmandic/automatic/wiki/Installation) and [SDXL](https://github.com/vladmandic/automatic/wiki/SDXL)

## Integration

### Standard workflows  

- **txt2img**
- **img2img**
- **inpaint**
- **process**

### Model Access

- For standard **SD 1.5** and **SD 2.1** models, you can use either  
  standard *safetensor* models (single file) or *diffusers* models (folder structure)
- For additional models, you can use *diffusers* models only  
- You can download diffuser models directly from [Huggingface hub](https://huggingface.co/)  
  or use built-in model search & download in SD.Next: **UI -> Models -> Huggingface**
- Note that access to some models is gated  
  In that case, you need to accept the model EULA and provide your huggingface token  
- When loading safetensors models, you must specify model pipeline type in:  
  **UI -> Settings -> Diffusers -> Pipeline**  
  When loading huggingface models, pipeline type is automatically detected  
- If you get the error `Diffuser model downloaded error: model=stabilityai/stable-diffusion-etc [Errno 2] No such file or directory:`  
  you need to go to the HuggingFace page for that model and accept its EULA

### Extra Networks

- Lora networks  
- Textual inversions (embeddings)  

Note that Lora and TI are still model-specific, so you cannot use a Lora trained on SD 1.5 with SD-XL  
(just as you couldn't with an SD 2.1 model) - it needs to be trained for a specific model  

Support for SD-XL training is expected shortly  

### Diffuser Settings

- UI -> Settings -> Diffuser Settings  
  contains additional tunable parameters  

### Samplers

- Samplers (schedulers) are pipeline specific, so when running with diffuser backend, you'll see a different list of samplers
- UI -> Settings -> Sampler Settings shows different configurable parameters depending on backend  
- Recommended sampler for diffusers is **DEIS**

### Other

- Updated **System Info** tab with additional information
- Support for `lowvram` and `medvram` modes - Both work extremely well  
  Additional tunables are available in UI -> Settings -> Diffuser Settings  
- Support for both default **SDP** and **xFormers** cross-optimizations  
  Other cross-optimization methods are not available  
- **Extra Networks UI** will show available diffusers models  
- **CUDA model compile**  
  UI -> Settings -> Compute Settings  
  Requires a GPU with high VRAM  
  Diffusers recommends the `reduce-overhead` compile mode, but other modes are available as well  
  Fullgraph compile is possible (with sufficient VRAM) when using diffusers  
- Note that some CUDA compile modes only work on Linux
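
Under the hood this setting corresponds to wrapping the model with `torch.compile`. A minimal CPU-runnable sketch (the real setting compiles the diffusion model itself, not a toy layer like this):

```python
import torch

# mode="reduce-overhead" is the mode recommended by diffusers; on CUDA it
# enables CUDA graphs, on CPU it falls back to the default inductor path
model = torch.nn.Linear(8, 8)
compiled = torch.compile(model, mode="reduce-overhead")

x = torch.randn(2, 8)
# the compiled module produces the same result as eager mode
assert torch.allclose(compiled(x), model(x), atol=1e-5)
```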

## SD-XL Notes

- [SD-XL Technical Report](https://github.com/Stability-AI/generative-models/blob/main/assets/sdxl_report.pdf)
- SD-XL model is designed as two-stage model  
  You can run SD-XL pipeline using just `base` model or load both `base` and `refiner` models  
  - `base`: Trained on images with a variety of aspect ratios and uses OpenCLIP-ViT/G and CLIP-ViT/L for text encoding  
  - `refiner`: Trained to denoise small noise levels of high quality data and uses the OpenCLIP model  
  - Having both `base` model and `refiner` model loaded can require significant VRAM
  - If you want to use `refiner` model, it is advised to add `sd_model_refiner` to **quicksettings**  
  in UI Settings -> User Interface
- SD-XL model was trained on **1024px** images  
  You can use it with smaller sizes, but you will likely get better results with SD 1.5 models  
- SD-XL model NSFW filter has been turned off  

### Download SD-XL 1.0

1. Enter `stabilityai/stable-diffusion-xl-base-1.0` in *Select Model* and press *Download*
2. Enter `stabilityai/stable-diffusion-xl-refiner-1.0` in *Select Model* and press *Download*

## Limitations

- Any extension that requires access to model internals will likely not work when using the diffusers backend  
  This includes, for example, standard extensions such as `ControlNet` and `MultiDiffusion`  
  *Note: application will auto-disable incompatible built-in extensions when running in diffusers mode*  
- Explicit `refiner` as postprocessing is not yet implemented  
- Hypernetworks are not supported  
- Limited callbacks support for scripts/extensions: additional callbacks will be added as needed  

## Performance

Comparison of the original stable diffusion pipeline and the diffusers pipeline when using a standard SD 1.5 model  
Performance is measured for `batch-size` 1, 2, 4, 8, 16  

| pipeline | performance it/s (batch 1 / 2 / 4 / 8 / 16) | memory cpu/gpu (GB) |
| --- | --- | --- |
| original | 7.99 / 7.93 / 8.83 / 9.14 / 9.2 | 6.7 / 7.2 |
| original medvram | 6.23 / 7.16 / 8.41 / 9.24 / 9.68 | 8.4 / 6.8 |
| original lowvram | 1.05 / 1.94 / 3.2 / 4.81 / 6.46 | 8.8 / 5.2 |
| diffusers | 9 / 7.4 / 8.2 / 8.4 / 7.0 | 4.3 / 9.0 |
| diffusers medvram | 7.5 / 6.7 / 7.5 / 7.8 / 7.2 | 6.6 / 8.2 |
| diffusers lowvram | 7.0 / 7.0 / 7.4 / 7.7 / 7.8 | 4.3 / 7.2 |
| diffusers with safetensors | 8.9 / 7.3 / 8.1 / 8.4 / 7.1 | 5.9 / 9.0 |

Notes:

- Test environment: nVidia RTX 3060 GPU, Torch 2.1-nightly with CUDA 12.1, Cross-optimization: SDP
- All else being equal, the diffusers pipeline seems to:
  - Use slightly less RAM and more VRAM
  - Have highly efficient medvram/lowvram equivalents which don't lose a lot of performance  
  - Be faster at smaller batch sizes and slower at larger ones