metadata
license: apache-2.0
language:
  - en
base_model:
  - stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
  - art

SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

  • 10k dataset proof of concept (completed)

  • 200k+ dataset finetune (in testing/training)

  • 12M dataset finetune (planned)

Example Outputs

[Images: Example Outputs 1 to 5]

Example prompt: A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.

Example prompt 2: a Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room,real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1. 2 ISO100 35MM

The training data combines the Proteus and Mobius datasets.

Recommended Inference Parameters

ComfyUI workflow

"sampler": "euler_ancestral", # Best results with Euler Ancestral
"scheduler": "normal", # Normal noise schedule
"steps": 28, # Optimal step count
"cfg": 7.5 # Classifier-free guidance scale

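For use outside ComfyUI, the settings above can be sketched as a diffusers call. This is a minimal sketch, not a tested recipe: the repo id `dataautogpt3/ProteusSigma` is an assumption inferred from the model name, the helper names `to_diffusers_kwargs` and `generate` are hypothetical, and the ComfyUI "normal" scheduler is assumed to match the default timestep spacing of the model's own scheduler config.

```python
# Recommended settings from this README, keyed by their ComfyUI names.
COMFYUI_PARAMS = {
    "sampler": "euler_ancestral",
    "scheduler": "normal",
    "steps": 28,
    "cfg": 7.5,
}


def to_diffusers_kwargs(params):
    """Translate ComfyUI-style names into diffusers pipeline call arguments."""
    return {
        "num_inference_steps": params["steps"],
        "guidance_scale": params["cfg"],
    }


def generate(prompt, repo_id="dataautogpt3/ProteusSigma"):
    """Load the pipeline and render one image (needs a GPU and a model download).

    repo_id is an assumption and is not confirmed by this README.
    """
    # Imported lazily so the helpers above work without torch/diffusers installed.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
    # ComfyUI's "euler_ancestral" sampler corresponds to this scheduler class.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to("cuda")
    return pipe(prompt, **to_diffusers_kwargs(COMFYUI_PARAMS)).images[0]
```

Calling `generate("A digital illustration of a lich ...")` would return a PIL image. A ZTSNR-trained checkpoint may need scheduler settings beyond what this sketch sets.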
Model Details

  • Model Type: SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
  • Base Model: stabilityai/stable-diffusion-xl-base-1.0
  • Training Dataset: 10,000 high-quality images
  • License: Apache 2.0

Key Features

  • Zero Terminal SNR (ZTSNR) implementation
  • Increased σ_max ≈ 20000.0 (NovelAI research)
  • High-resolution coherence enhancements
  • Tag-based CLIP weighting
  • VAE improvements

Technical Specifications

  • Noise Schedule: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
  • Progressive Steps: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
  • Resolution Scaling: √(H×W)/1024
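To make the numbers above concrete, here is a small sketch that evaluates the sigma ladder and the resolution factor. It assumes the common variance-exploding convention (noisy sample = clean sample + σ × unit noise, unit-variance data), under which SNR = 1/σ². The README does not state this convention or how the resolution factor is applied, so the function names and the SNR formula here are illustrative assumptions.

```python
import math

# Sigma ladder as listed under "Progressive Steps".
SIGMAS = [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]


def snr(sigma):
    """Signal-to-noise ratio, assuming x_noisy = x + sigma * noise (assumption)."""
    return 1.0 / (sigma * sigma)


def resolution_scale(height, width):
    """The README's resolution scaling factor: sqrt(H * W) / 1024."""
    return math.sqrt(height * width) / 1024


# The ladder runs strictly from sigma_max down to sigma_min.
is_decreasing = all(a > b for a, b in zip(SIGMAS, SIGMAS[1:]))

for sigma in SIGMAS:
    print(f"sigma={sigma:>9}  snr={snr(sigma):.3e}")
```

Under that assumption, the first step at σ = 20000 has an SNR of 2.5e-9, which is effectively zero and is what "zero terminal SNR" refers to, while `resolution_scale(1024, 1024)` gives 1.0 and `resolution_scale(2048, 2048)` gives 2.0.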

Training Details

Training Configuration

  • Learning Rate: 1e-6
  • Batch Size: 1
  • Gradient Accumulation Steps: 1
  • Optimizer: AdamW
  • Precision: bfloat16
  • VAE Finetuning: Enabled
  • VAE Learning Rate: 1e-6
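The configuration above can be captured as a small config object. This is only a sketch for readers who want to reuse the values in their own scripts; the class and field names are hypothetical and do not come from the actual training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values copied from the Training Configuration list; names are illustrative.
    learning_rate: float = 1e-6
    batch_size: int = 1
    gradient_accumulation_steps: int = 1
    optimizer: str = "AdamW"
    precision: str = "bfloat16"
    finetune_vae: bool = True
    vae_learning_rate: float = 1e-6

    @property
    def effective_batch_size(self):
        """Samples contributing to each optimizer step."""
        return self.batch_size * self.gradient_accumulation_steps


config = TrainingConfig()
print(config.effective_batch_size)
```

With a batch size of 1 and no gradient accumulation, each optimizer step sees a single sample, which is consistent with the very small learning rate listed.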

CLIP Weight Configuration

  • Character Weight: 1.5
  • Style Weight: 1.2
  • Quality Weight: 0.8
  • Setting Weight: 1.0
  • Action Weight: 1.1
  • Object Weight: 0.9
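The README does not say how these per-category weights enter training. As one plausible reading, the sketch below averages the weights of a caption's tag categories into a single per-sample multiplier. The aggregation rule, the `caption_weight` helper, and the lowercase category keys are all assumptions made for illustration.

```python
# Category weights from the CLIP Weight Configuration list.
CLIP_WEIGHTS = {
    "character": 1.5,
    "style": 1.2,
    "quality": 0.8,
    "setting": 1.0,
    "action": 1.1,
    "object": 0.9,
}


def caption_weight(tag_categories, weights=CLIP_WEIGHTS, default=1.0):
    """Mean weight over the categories of a caption's tags (hypothetical rule).

    Unknown categories fall back to `default`; an empty caption gets `default`.
    """
    if not tag_categories:
        return default
    total = sum(weights.get(category, default) for category in tag_categories)
    return total / len(tag_categories)


# A caption tagged with one character tag and one style tag.
example = caption_weight(["character", "style"])
```

Under this rule, character-heavy captions are emphasized relative to captions that consist mostly of quality or object tags.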

Performance Improvements

  • 47% fewer artifacts at σ < 5.0
  • Stable composition at σ > 12.4
  • 31% better detail consistency
  • Improved color accuracy
  • Enhanced dark tone reproduction

Repository and Resources

Citation

@article{ossa2024improvements,
  title={Improvements to SDXL in NovelAI Diffusion V3},
  author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
  journal={arXiv preprint arXiv:2409.15997v2},
  year={2024}
}