	---
	license: apache-2.0
	language:
	- en
	base_model:
	- stabilityai/stable-diffusion-xl-base-1.0
	pipeline_tag: text-to-image
	tags:
	- art
	---
	# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

	- [x] 10k dataset proof of concept (completed): [link](https://huggingface.co/dataautogpt3/ProteusSigma)

	- [ ] 200k+ dataset finetune (in testing/training)

	- [ ] 12M dataset finetune (planned)

	## Example Outputs

	<div style="display: flex; flex-wrap: wrap; gap: 10px; justify-content: center;">
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example.png" width="256" alt="Example Output 1"/>
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example2.png" width="256" alt="Example Output 2"/>
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example3.png" width="256" alt="Example Output 3"/>
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example4.png" width="256" alt="Example Output 4"/>
	<img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example5.png" width="256" alt="Example Output 5"/>
	</div>

	Example prompt: `A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.`

	Example prompt 2: `a Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room,real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1. 2 ISO100 35MM`



	Training data: combined Proteus and Mobius datasets.

	## Recommended Inference Parameters


	[ComfyUI workflow](https://huggingface.co/dataautogpt3/sdxl-ztsnr-sigma-10k/blob/main/ComfyUI-test10k.json)

	- Sampler: `euler_ancestral` (best results with Euler Ancestral)
	- Scheduler: `normal` (normal noise schedule)
	- Steps: 28 (recommended step count)
	- CFG: 7.5 (classifier-free guidance scale)

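As a rough illustration, the recommended settings can be collected into the scalar inputs of a ComfyUI `KSampler` node. This is a minimal sketch, not taken from the linked workflow: the helper name `ksampler_inputs` is hypothetical, and the `seed` and `denoise` defaults are assumptions. ComfyUI names the sampler field `sampler_name`; the node's `model`, `positive`, `negative`, and `latent_image` links are omitted because they depend on the node ids of a particular workflow.

```python
import json

# Values from the recommended inference parameters above.
RECOMMENDED = {
    "sampler_name": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}


def ksampler_inputs(seed: int, denoise: float = 1.0) -> dict:
    """Return the scalar inputs for a ComfyUI KSampler node (hypothetical helper).

    `seed` and `denoise` are not specified by this model card; they are
    placeholders chosen by the caller.
    """
    return {"seed": seed, "denoise": denoise, **RECOMMENDED}


if __name__ == "__main__":
    # Print the settings as JSON for pasting into an API-format workflow.
    print(json.dumps(ksampler_inputs(seed=0), indent=2))
```

The linked ComfyUI workflow remains the reference for how the nodes are actually wired.
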
	## Model Details

	- Model Type: SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
	- Base Model: stabilityai/stable-diffusion-xl-base-1.0
	- Training Dataset: 10,000 high-quality images
	- License: Apache 2.0

	## Key Features

	- Zero Terminal SNR (ZTSNR) implementation
	- Increased σ_max ≈ 20000.0 (NovelAI research)
	- High-resolution coherence enhancements
	- Tag-based CLIP weighting
	- VAE improvements

	### Technical Specifications

	- Noise Schedule: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
	- Progressive Steps: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
	- Resolution Scaling: √(H×W)/1024
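
To make the numbers above concrete, the sketch below converts each listed σ to a signal-to-noise ratio and evaluates the resolution scaling factor. It assumes the usual convention that a noised sample is `x0 + sigma * noise` with unit-variance `x0`, so SNR = 1/σ². The function names are illustrative and do not come from the training repository. How the scaling factor is applied during training is not stated here, so the code only computes it.

```python
import math

# Sigma values listed under "Progressive Steps" above.
SIGMAS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def snr(sigma: float) -> float:
    """SNR of x = x0 + sigma * noise, assuming unit-variance x0."""
    return 1.0 / (sigma * sigma)


def resolution_scale(height: int, width: int) -> float:
    """The factor sqrt(H * W) / 1024 from the list above."""
    return math.sqrt(height * width) / 1024


if __name__ == "__main__":
    for s in SIGMAS:
        print(f"sigma={s:>10}  SNR={snr(s):.3e}")
    print(resolution_scale(1024, 1024))
    print(resolution_scale(1536, 1024))
```

At σ = 20000 the SNR is about 2.5e-9, which is effectively zero. This is what "zero terminal SNR" refers to in practice. A 1024×1024 image gives a scaling factor of exactly 1.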

	## Training Details

	### Training Configuration
	- Learning Rate: 1e-6
	- Batch Size: 1
	- Gradient Accumulation Steps: 1
	- Optimizer: AdamW
	- Precision: bfloat16
	- VAE Finetuning: Enabled
	- VAE Learning Rate: 1e-6
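
For reference, the settings above can be held in a small config object. This is only a sketch: the class and field names are hypothetical and do not mirror the training repository's actual configuration schema. The `effective_batch_size` property shows that a batch size of 1 with 1 accumulation step gives one sample per optimizer update on each device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed above."""

    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer update on one device.
        return self.batch_size * self.gradient_accumulation_steps


if __name__ == "__main__":
    cfg = TrainingConfig()
    print(cfg)
    print("effective batch size:", cfg.effective_batch_size)
```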

	### CLIP Weight Configuration
	- Character Weight: 1.5
	- Style Weight: 1.2
	- Quality Weight: 0.8
	- Setting Weight: 1.0
	- Action Weight: 1.1
	- Object Weight: 0.9
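
One way to picture the weights above is as a lookup from tag category to weight. This sketch only pairs each tag with the weight of its category. The function name, the `(tag, category)` input format, and the default of 1.0 for unknown categories are all assumptions. How the weights enter the CLIP conditioning or the loss is defined in the training repository and is not reproduced here.

```python
# Weights from the CLIP weight configuration above.
CLIP_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def weight_tags(
    tagged: list[tuple[str, str]], default: float = 1.0
) -> list[tuple[str, float]]:
    """Pair each tag with the weight of its category.

    `tagged` holds (tag, category) pairs; categories not in CLIP_WEIGHTS
    fall back to `default` (an assumption, not from the model card).
    """
    return [(tag, CLIP_WEIGHTS.get(category, default)) for tag, category in tagged]


if __name__ == "__main__":
    example = [
        ("lich", "character"),
        ("digital illustration", "style"),
        ("whiteboard", "object"),
    ]
    for tag, weight in weight_tags(example):
        print(f"{tag}: {weight}")
```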


	## Performance Improvements

	- 47% fewer artifacts at σ < 5.0
	- Stable composition at σ > 12.4
	- 31% better detail consistency
	- Improved color accuracy
	- Enhanced dark tone reproduction

	## Repository and Resources

	- GitHub Repository: [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
	- Training Code: Available in the repository
	- Documentation: [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
	- Issues and Support: [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)

	## Citation

	```bibtex
	@article{ossa2024improvements,
	title={Improvements to SDXL in NovelAI Diffusion V3},
	author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
	journal={arXiv preprint arXiv:2409.15997v2},
	year={2024}
	}
	```