license: apache-2.0
language:
- en
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
- art
SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements
- 10k dataset proof of concept (completed)
- 200k+ dataset finetune (in testing/training)
- 12M dataset finetune (planned)
Example Outputs
Example prompt: A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.
Example prompt 2: a Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room,real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1. 2 ISO100 35MM
Trained on the combined Proteus and Mobius datasets.
Recommended Inference Parameters
"sampler": "euler_ancestral", # Best results with Euler Ancestral
"scheduler": "normal", # Normal noise schedule
"steps": 28, # Optimal step count
"cfg": 7.5 # Classifier-free guidance scale
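The setting names above look like ComfyUI sampler settings. As a minimal sketch, assuming you want the equivalent arguments for a Hugging Face diffusers pipeline, the values could be translated as shown below. The helper `to_diffusers_kwargs` and the `SAMPLER_TO_DIFFUSERS` table are hypothetical names introduced for illustration; `num_inference_steps`, `guidance_scale`, and `EulerAncestralDiscreteScheduler` are the diffusers names this sketch assumes correspond to `steps`, `cfg`, and `euler_ancestral`.

```python
# Recommended settings, copied from the list above.
RECOMMENDED = {
    "sampler": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}

# Assumed mapping from sampler names to diffusers scheduler class names.
SAMPLER_TO_DIFFUSERS = {
    "euler_ancestral": "EulerAncestralDiscreteScheduler",
}


def to_diffusers_kwargs(settings):
    """Translate the settings dict into diffusers-style argument names."""
    sampler = settings["sampler"]
    if sampler not in SAMPLER_TO_DIFFUSERS:
        raise ValueError(f"no known diffusers scheduler for sampler {sampler!r}")
    return {
        "scheduler_class": SAMPLER_TO_DIFFUSERS[sampler],
        "num_inference_steps": settings["steps"],
        "guidance_scale": settings["cfg"],
    }


kwargs = to_diffusers_kwargs(RECOMMENDED)
print(kwargs["scheduler_class"], kwargs["num_inference_steps"], kwargs["guidance_scale"])
```

The sketch only prepares argument names and values. It does not load the model, so it runs without diffusers installed.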
Model Details
- Model Type: SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
- Base Model: stabilityai/stable-diffusion-xl-base-1.0
- Training Dataset: 10,000 high-quality images
- License: Apache 2.0
Key Features
- Zero Terminal SNR (ZTSNR) implementation
- Increased σ_max ≈ 20000.0 (NovelAI research)
- High-resolution coherence enhancements
- Tag-based CLIP weighting
- VAE improvements
Technical Specifications
- Noise Schedule: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
- Progressive Steps: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
- Resolution Scaling: √(H×W)/1024
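To make the numbers above concrete, the following sketch computes the signal-to-noise ratio implied by a given σ and the resolution scaling factor. It assumes the usual formulation in which a noised sample is `x + sigma * noise` with unit-variance data, so SNR = 1/σ². The function names are illustrative, and how the training code applies the resolution factor to the schedule is not stated here.

```python
import math

SIGMA_MAX = 20000.0      # from the noise schedule above
SIGMA_MIN = 0.0292       # from the noise schedule above
BASE_RESOLUTION = 1024   # denominator of the resolution scaling formula


def snr(sigma):
    """Signal-to-noise ratio under the assumed model x + sigma * noise."""
    return 1.0 / (sigma * sigma)


def resolution_scale(height, width):
    """Resolution scaling factor sqrt(H * W) / 1024."""
    return math.sqrt(height * width) / BASE_RESOLUTION


print(snr(SIGMA_MAX))                # 2.5e-09, effectively zero terminal SNR
print(resolution_scale(1024, 1024))  # 1.0
print(resolution_scale(1536, 1536))  # 1.5
```

At σ_max ≈ 20000 the SNR is about 2.5e-9, which is why such a large σ_max serves as a practical stand-in for a zero terminal SNR.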
Training Details
Training Configuration
- Learning Rate: 1e-6
- Batch Size: 1
- Gradient Accumulation Steps: 1
- Optimizer: AdamW
- Precision: bfloat16
- VAE Finetuning: Enabled
- VAE Learning Rate: 1e-6
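The configuration above could be captured in a small typed structure, for example as below. This is only a sketch: the class name `TrainingConfig` and its field names are hypothetical and are not taken from the training repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values listed above."""
    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer step.
        return self.batch_size * self.gradient_accumulation_steps


config = TrainingConfig()
print(config.effective_batch_size)  # 1
```

With a batch size of 1 and no gradient accumulation, every optimizer step is computed from a single image.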
CLIP Weight Configuration
- Character Weight: 1.5
- Style Weight: 1.2
- Quality Weight: 0.8
- Setting Weight: 1.0
- Action Weight: 1.1
- Object Weight: 0.9
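As an illustration of the weights above, the sketch below looks up a weight for each tag from its category. How the training code assigns tags to categories and applies the weights inside CLIP conditioning is not described here, so the `(tag, category)` input format and the `weight_tags` helper are assumptions made for this example.

```python
# Category weights, copied from the list above.
CATEGORY_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def weight_tags(tagged_caption, default=1.0):
    """Pair each tag with its category weight; unknown categories get `default`."""
    return [
        (tag, CATEGORY_WEIGHTS.get(category, default))
        for tag, category in tagged_caption
    ]


# Hypothetical pre-categorized caption.
caption = [
    ("lich", "character"),
    ("digital illustration", "style"),
    ("whiteboard", "object"),
]
print(weight_tags(caption))
# [('lich', 1.5), ('digital illustration', 1.2), ('whiteboard', 0.9)]
```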
Performance Improvements
- 47% fewer artifacts at σ < 5.0
- Stable composition at σ > 12.4
- 31% better detail consistency
- Improved color accuracy
- Enhanced dark tone reproduction
Repository and Resources
- GitHub Repository: SDXL-Training-Improvements
- Training Code: Available in the repository
- Documentation: Implementation Details
- Issues and Support: GitHub Issues
Citation
@article{ossa2024improvements,
title={Improvements to SDXL in NovelAI Diffusion V3},
author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
journal={arXiv preprint arXiv:2409.15997v2},
year={2024}
}