---
license: apache-2.0
language:
- en
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
- art
---
# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

- [x] 10k dataset proof of concept (completed): [ProteusSigma](https://huggingface.co/dataautogpt3/ProteusSigma)

- [ ] 200k+ dataset finetune (in testing/training)

- [ ] 12M dataset finetune (planned)

## Example Outputs

<div style="display: flex; flex-wrap: wrap; gap: 10px; justify-content: center;">
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example.png" width="256" alt="Example Output 1"/>
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example2.png" width="256" alt="Example Output 2"/>
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example3.png" width="256" alt="Example Output 3"/>
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example4.png" width="256" alt="Example Output 4"/>
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example5.png" width="256" alt="Example Output 5"/>
</div>

Example prompt: `A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.`

Example prompt 2: `a Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room,real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1. 2 ISO100 35MM`



Combined Proteus and Mobius datasets.

## Recommended Inference Parameters


[ComfyUI workflow](https://huggingface.co/dataautogpt3/sdxl-ztsnr-sigma-10k/blob/main/ComfyUI-test10k.json)

- **Sampler:** `euler_ancestral` (best results with Euler Ancestral)
- **Scheduler:** `normal` (normal noise schedule)
- **Steps:** 28 (optimal step count)
- **CFG:** 7.5 (classifier-free guidance scale)

## Model Details

- **Model Type:** SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
- **Base Model:** stabilityai/stable-diffusion-xl-base-1.0
- **Training Dataset:** 10,000 high-quality images
- **License:** Apache 2.0

## Key Features

- Zero Terminal SNR (ZTSNR) implementation
- Increased σ_max ≈ 20000.0 (NovelAI research)
- High-resolution coherence enhancements
- Tag-based CLIP weighting
- VAE improvements

### Technical Specifications

- **Noise Schedule**: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
- **Progressive Steps**: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
- **Resolution Scaling**: √(H×W)/1024
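
The numbers above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes the common convention SNR(σ) = 1/σ² (not stated in this card) to show why σ_max ≈ 20000 is effectively zero terminal SNR, and it evaluates the resolution scaling factor √(H×W)/1024. The function names `snr` and `resolution_scale` are hypothetical, and how the factor is applied during training is not specified here, so the code only computes it.

```python
import math

# Sigma values copied from the "Progressive Steps" list above.
PROGRESSIVE_SIGMAS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def snr(sigma: float) -> float:
    """Signal-to-noise ratio, ASSUMING the convention SNR = 1 / sigma^2."""
    return 1.0 / (sigma * sigma)


def resolution_scale(height: int, width: int) -> float:
    """Resolution scaling factor sqrt(H * W) / 1024 from the list above."""
    return math.sqrt(height * width) / 1024


if __name__ == "__main__":
    # SNR shrinks toward zero as sigma approaches sigma_max.
    for sigma in PROGRESSIVE_SIGMAS:
        print(f"sigma={sigma:>10}  snr={snr(sigma):.3e}")
    # The factor is 1.0 at the 1024x1024 base resolution.
    print(resolution_scale(1024, 1024))
    print(resolution_scale(2048, 2048))
```

Under that assumption, σ = 20000 gives an SNR of 2.5e-9, which is why the largest step is treated as pure noise, while a 2048×2048 canvas yields a scaling factor of 2.0.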

## Training Details

### Training Configuration
- **Learning Rate:** 1e-6
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Optimizer:** AdamW
- **Precision:** bfloat16
- **VAE Finetuning:** Enabled
- **VAE Learning Rate:** 1e-6
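
For reference, the settings above can be collected in a small configuration object. This is a minimal sketch, not the layout used by the training repository: the class `TrainingConfig` and its field names are hypothetical, and the update count assumes single-device training.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed above."""

    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self) -> int:
        # Samples consumed per optimizer update (single device assumed).
        return self.batch_size * self.gradient_accumulation_steps

    def updates_per_epoch(self, dataset_size: int) -> int:
        # Ceiling division so a partial final batch still counts as an update.
        return -(-dataset_size // self.effective_batch_size)


if __name__ == "__main__":
    config = TrainingConfig()
    print(config.effective_batch_size)
    print(config.updates_per_epoch(10_000))
```

With a batch size of 1 and no gradient accumulation, every image is its own optimizer update, so one pass over the 10,000-image dataset corresponds to 10,000 updates under these assumptions.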

### CLIP Weight Configuration
- **Character Weight:** 1.5
- **Style Weight:** 1.2
- **Quality Weight:** 0.8
- **Setting Weight:** 1.0
- **Action Weight:** 1.1
- **Object Weight:** 0.9
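
As an illustration of how per-category weights like these could be turned into a single per-caption weight, the sketch below averages the weights of a caption's tag categories. The averaging rule, the function name `caption_weight`, and the fallback weight of 1.0 for unknown categories are all assumptions made for this example; this card does not state how the training code combines the weights.

```python
# Category weights copied from the list above.
CATEGORY_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def caption_weight(tag_categories: list[str], default: float = 1.0) -> float:
    """Mean category weight over a caption's tags (ASSUMED combination rule)."""
    if not tag_categories:
        return default
    # Unknown categories fall back to the neutral default weight.
    weights = [CATEGORY_WEIGHTS.get(category, default) for category in tag_categories]
    return sum(weights) / len(weights)


if __name__ == "__main__":
    # A caption with one character tag, one style tag, and one quality tag.
    print(caption_weight(["character", "style", "quality"]))
```

Averaging keeps the result within the range of the listed weights (0.8 to 1.5) regardless of how many tags a caption has; a product or a maximum would be equally plausible alternatives.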


## Performance Improvements

- 47% fewer artifacts at σ < 5.0
- Stable composition at σ > 12.4
- 31% better detail consistency
- Improved color accuracy
- Enhanced dark tone reproduction

## Repository and Resources

- **GitHub Repository:** [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
- **Training Code:** Available in the repository
- **Documentation:** [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
- **Issues and Support:** [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)

## Citation

```bibtex
@article{ossa2024improvements,
  title={Improvements to SDXL in NovelAI Diffusion V3},
  author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
  journal={arXiv preprint arXiv:2409.15997v2},
  year={2024}
}
```