---
license: apache-2.0
language:
- en
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
- art
---

# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

- [x] 10k dataset proof of concept (completed): [link](https://huggingface.co/dataautogpt3/ProteusSigma)
- [ ] 200k+ dataset finetune (in testing/training)
- [ ] 12M dataset finetune (planned)

## Example Outputs

<div style="display: flex; flex-wrap: wrap; gap: 10px; justify-content: center;">
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example.png" width="256" alt="Example Output 1"/>
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example2.png" width="256" alt="Example Output 2"/>
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example3.png" width="256" alt="Example Output 3"/>
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example4.png" width="256" alt="Example Output 4"/>
<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example5.png" width="256" alt="Example Output 5"/>
</div>

Example prompt: `A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.`

Example prompt 2: `a Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room,real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1. 2 ISO100 35MM`

## Dataset

Combined Proteus and Mobius datasets.

## Recommended Inference Parameters

[ComfyUI workflow](https://huggingface.co/dataautogpt3/sdxl-ztsnr-sigma-10k/blob/main/ComfyUI-test10k.json)

- **Sampler:** `euler_ancestral` (best results with Euler Ancestral)
- **Scheduler:** `normal` (normal noise schedule)
- **Steps:** 28 (optimal step count)
- **CFG:** 7.5 (classifier-free guidance scale)

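
For scripted use outside ComfyUI, the recommended settings can be kept as plain data and mapped onto pipeline arguments. The sketch below rests on several assumptions that this card does not confirm: that the `diffusers` library is used, that the `dataautogpt3/ProteusSigma` repository ships weights in a layout `StableDiffusionXLPipeline.from_pretrained` can read, and that `EulerAncestralDiscreteScheduler` is an acceptable counterpart to ComfyUI's `euler_ancestral` sampler. The names `RECOMMENDED`, `to_diffusers_kwargs`, and `load_pipeline` are hypothetical.

```python
# Recommended ComfyUI settings from this card, kept as plain data.
RECOMMENDED = {
    "sampler": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}


def to_diffusers_kwargs(settings):
    """Map ComfyUI-style names onto diffusers pipeline call arguments."""
    return {
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
    }


def load_pipeline(model_id="dataautogpt3/ProteusSigma"):
    """Assumption: the repo is loadable by diffusers. Imports are deferred
    so the settings above can be used without torch installed."""
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Swap in Euler Ancestral to mirror the recommended sampler.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    return pipe


if __name__ == "__main__":
    print(to_diffusers_kwargs(RECOMMENDED))
```

If those assumptions hold, `load_pipeline()(prompt, **to_diffusers_kwargs(RECOMMENDED)).images[0]` would return a generated image; the linked ComfyUI workflow remains the reference setup.
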
## Model Details

- **Model Type:** SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
- **Base Model:** stabilityai/stable-diffusion-xl-base-1.0
- **Training Dataset:** 10,000 high-quality images
- **License:** Apache 2.0

## Key Features

- Zero Terminal SNR (ZTSNR) implementation
- Increased σ_max ≈ 20000.0 (NovelAI research)
- High-resolution coherence enhancements
- Tag-based CLIP weighting
- VAE improvements

### Technical Specifications

- **Noise Schedule:** σ_max ≈ 20000.0 to σ_min ≈ 0.0292
- **Progressive Steps:** [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
- **Resolution Scaling:** √(H×W)/1024

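
To make these numbers concrete, the sketch below evaluates the signal-to-noise ratio at the schedule endpoints and the resolution scaling factor. It assumes the σ values follow the k-diffusion/EDM convention, in which a noised sample is `x + sigma * eps` and therefore SNR = 1/σ². The card states neither the convention nor how the scaling factor is applied during training, so the function names and their use here are illustrative only.

```python
import math

# Values copied from the technical specifications of this card.
SIGMA_MAX = 20000.0
SIGMA_MIN = 0.0292
PROGRESSIVE_STEPS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def snr(sigma):
    """SNR under the assumed convention x_noisy = x + sigma * eps."""
    return 1.0 / (sigma * sigma)


def resolution_scale(height, width):
    """Resolution scaling factor sqrt(H * W) / 1024 from the card."""
    return math.sqrt(height * width) / 1024


if __name__ == "__main__":
    print(f"SNR at sigma_max: {snr(SIGMA_MAX):.1e}")
    print(f"SNR at sigma_min: {snr(SIGMA_MIN):.0f}")
    print(f"scale at 1024x1024: {resolution_scale(1024, 1024)}")
    print(f"scale at 1536x1536: {resolution_scale(1536, 1536)}")
```

Under that assumption the SNR at σ_max is 2.5e-9, which is effectively zero and is the sense in which the schedule has zero terminal SNR. A 1024×1024 image gives a scaling factor of 1.0, and 1536×1536 gives 1.5.
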
## Training Details

### Training Configuration

- **Learning Rate:** 1e-6
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Optimizer:** AdamW
- **Precision:** bfloat16
- **VAE Finetuning:** Enabled
- **VAE Learning Rate:** 1e-6

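
For reference, the same configuration can be written as a small typed object. This is a minimal sketch: the class and field names are hypothetical and are not the keys used by the training repository; only the values come from the list above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed in this card."""

    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self):
        # Samples contributing to each optimizer update.
        return self.batch_size * self.gradient_accumulation_steps


if __name__ == "__main__":
    config = TrainingConfig()
    print(asdict(config))
    print("effective batch size:", config.effective_batch_size)
```

With a batch size of 1 and no gradient accumulation, each optimizer update sees a single sample.
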
### CLIP Weight Configuration

- **Character Weight:** 1.5
- **Style Weight:** 1.2
- **Quality Weight:** 0.8
- **Setting Weight:** 1.0
- **Action Weight:** 1.1
- **Object Weight:** 0.9

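
The card lists the per-category weights but does not describe how tags are categorized or how the weights enter the CLIP conditioning. The sketch below therefore shows only the lookup step, as an illustration rather than the repository's method. `CLIP_TAG_WEIGHTS`, `weight_tags`, the `(tag, category)` input format, and the fallback weight of 1.0 for unknown categories are all assumptions.

```python
# Category weights copied from this card.
CLIP_TAG_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def weight_tags(tagged_caption, default=1.0):
    """Pair each tag with the weight of its category.

    `tagged_caption` is a list of (tag, category) tuples. Unknown
    categories fall back to `default`, an assumption of this sketch.
    """
    return [
        (tag, CLIP_TAG_WEIGHTS.get(category, default))
        for tag, category in tagged_caption
    ]


if __name__ == "__main__":
    caption = [
        ("lich", "character"),
        ("digital illustration", "style"),
        ("whiteboard", "object"),
    ]
    print(weight_tags(caption))
```
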
## Performance Improvements

- 47% fewer artifacts at σ < 5.0
- Stable composition at σ > 12.4
- 31% better detail consistency
- Improved color accuracy
- Enhanced dark tone reproduction

## Repository and Resources

- **GitHub Repository:** [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
- **Training Code:** Available in the repository
- **Documentation:** [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
- **Issues and Support:** [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)

## Citation

```bibtex
@article{ossa2024improvements,
  title={Improvements to SDXL in NovelAI Diffusion V3},
  author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
  journal={arXiv preprint arXiv:2409.15997v2},
  year={2024}
}
```