---
license: other
license_name: fair-ai-public-license-1.0-sd
license_link: https://freedevproject.org/faipl-1.0-sd/
language:
- en
base_model:
- Laxhar/noobai-XL-1.0
pipeline_tag: text-to-image
library_name: diffusers
tags:
- safetensors
- diffusers
- stable-diffusion
- stable-diffusion-xl
---
# V-Prediction Loss Weighting Test

## Notice
This repository contains personal experimental records. No guarantees are made regarding accuracy or reproducibility.  
**These models are for verification purposes only and are not intended for general use.**

## Overview
This repository is a test project comparing different loss weighting schemes for Stable Diffusion v-prediction training.

## Environment
- [sd-scripts](https://github.com/kohya-ss/sd-scripts) dev branch
  - Commit hash: `6adb69b`, with local modifications

## Test Cases

This repository includes test models trained with different weighting schemes (the closed-form schemes are sketched in code after this list):

1. **test_normal_weight**
   - Baseline model using standard weighting

2. **test_edm2_weighting**
   - EDM2-style loss weighting scheme
   - Implementation by A

3. **test_min_snr_1**
   - Baseline model with `--min_snr_gamma = 1`

4. **test_debias_scale-like**
   - Baseline model with additional parameters:
     - `--debiased_estimation_loss`
     - `--scale_v_pred_loss_like_noise_pred`

5. **test_edm2_weight_new**
   - EDM2-style loss weighting scheme
   - Implementation by madman404
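
Cases 1, 3, and 4 apply fixed, closed-form weights to the per-timestep v-prediction MSE, while cases 2 and 5 learn the weighting during training (sketched under Training Parameters below). The following is a minimal sketch of the closed-form schemes, assuming the textbook formulations; the exact sd-scripts code may differ in clipping and edge-case handling:

```python
import torch

def v_pred_loss_weight(snr: torch.Tensor, scheme: str, gamma: float = 1.0) -> torch.Tensor:
    """Per-timestep weight for the v-prediction MSE loss, snr = alpha_t^2 / sigma_t^2."""
    if scheme == "normal":
        # Baseline: uniform weighting across timesteps.
        return torch.ones_like(snr)
    if scheme == "min_snr":
        # Min-SNR-gamma (Hang et al. 2023); min(SNR, gamma) / (SNR + 1)
        # is the v-prediction form of the weight.
        return torch.clamp(snr, max=gamma) / (snr + 1.0)
    if scheme == "debias_scale_like":
        # --debiased_estimation_loss (~ 1/sqrt(SNR), with SNR clipped for
        # stability) combined with --scale_v_pred_loss_like_noise_pred,
        # which rescales the v-loss by 1/(SNR + 1) to match noise-pred scale.
        return 1.0 / (torch.sqrt(snr.clamp(max=1000.0)) * (snr + 1.0))
    raise ValueError(f"unknown scheme: {scheme}")
```

The weight multiplies each sample's MSE before the batch reduction, e.g. `loss = (v_pred_loss_weight(snr, "min_snr") * per_sample_mse).mean()`.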

## Training Parameters
For detailed parameters, please refer to the `.toml` files in each model directory; a hypothetical sketch of the shared settings appears after the parameter list below.
Each model directory also contains the `sdxl_train.py` it was trained with (plus `t.py` for test_edm2_weighting, and `lossweightMLP.py` for test_edm2_weight_new).
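
For the EDM2-style runs, the weighting is not a fixed formula but a small network trained jointly with the U-Net to predict a per-noise-level log-uncertainty u(σ), in the spirit of Karras et al.'s EDM2 paper. The sketch below is a hypothetical stand-in for what `t.py` / `lossweightMLP.py` might implement; the actual files in the model directories are authoritative:

```python
import math
import torch
import torch.nn as nn

class LossWeightMLP(nn.Module):
    """Hypothetical stand-in for lossweightMLP.py: learns a per-noise-level
    log-uncertainty u(sigma) for EDM2-style adaptive loss weighting."""

    def __init__(self, n_freqs: int = 32, hidden: int = 256):
        super().__init__()
        # Fixed random frequencies for Fourier features of log(sigma).
        self.freqs = nn.Parameter(torch.randn(n_freqs), requires_grad=False)
        self.net = nn.Sequential(
            nn.Linear(2 * n_freqs, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )

    def forward(self, sigma: torch.Tensor) -> torch.Tensor:
        x = 2 * math.pi * torch.log(sigma)[:, None] * self.freqs[None, :]
        feats = torch.cat([x.sin(), x.cos()], dim=-1)
        return self.net(feats).squeeze(-1)  # u(sigma), one scalar per sample

def edm2_weighted_loss(raw_loss: torch.Tensor, sigma: torch.Tensor,
                       weight_net: LossWeightMLP) -> torch.Tensor:
    # Dividing by exp(u) adaptively downweights noise levels the network
    # finds hard; the additive u term keeps u from growing without bound.
    u = weight_net(sigma)
    return (raw_loss / u.exp() + u).mean()
```

EDM2 itself reportedly uses a very small network (essentially a linear layer on Fourier features of log σ); the shape of the objective is the same either way.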

Common parameters:
- Samples: 57,373
- Epochs: 3
- U-Net only
- Learning rate: 3.5e-6
- Batch size: 8
- Gradient accumulation steps: 4
- Optimizer: Adafactor (stochastic rounding)
- Training time: 13.5 GPU hours (RTX4090) per trial
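
As a rough illustration only, the shared settings above might look like the following config fragment (hypothetical; key names mirror sd-scripts options, values are copied from the list above, and the stochastic-rounding Adafactor setup is omitted):

```toml
# Hypothetical sketch of the shared settings; the per-model .toml files
# in this repository are authoritative.
v_parameterization = true
max_train_epochs = 3
learning_rate = 3.5e-6
train_batch_size = 8
gradient_accumulation_steps = 4
optimizer_type = "Adafactor"
```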

## Dataset Information
The dataset used for testing consists of:
- ~53,000 images extracted from danbooru2023 based on specific artist styles (approximately 300 artists)
- ~4,000 carefully selected danbooru images for standardization

**Note**: As this dataset is a subset of my regular training data focused on specific artists, the model's generalization might be limited. A wildcard file (wildcard_style.txt) containing the list of included artists is provided for reference.

### Tag Format
The training follows the tag format from [Kohaku-XL-Epsilon](https://huggingface.co/KBlueLeaf/Kohaku-XL-Epsilon):
`<1girl/1boy/1other/...>, <character>, <series>, <artists>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>`
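
As a concrete, hypothetical illustration, a prompt in this order might look like the one below; the character, series, and tags are arbitrary examples, and the quality/year/meta/rating vocabulary follows common danbooru-derived conventions rather than anything verified for this model:

```
1girl, hatsune miku, vocaloid, torino aqua, twintails, smile, looking at viewer, masterpiece, best quality, newest, absurdres, safe
```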

### Style Prompts
The following style prompts from Kohaku-XL-Epsilon might be compatible (untested):
```
ask \(askzy\), torino aqua, migolu, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
```
```
ciloranko, maccha \(mochancc\), lobelia \(saclia\), migolu, 
ask \(askzy\), wanke, (jiu ye sang:1.1), (rumoon:0.9), (mizumi zumi:1.1)
```
```
shiro9jira, ciloranko, ask \(askzy\), (tianliang duohe fangdongye:0.8)
```
```
(azuuru:1.1), (torino aqua:1.2), (azuuru:1.1), kedama milk, 
fuzichoco, ask \(askzy\), chen bin, atdan, hito, mignon
```
```
ask \(askzy\), torino aqua, migolu
```


*This model card was written with the assistance of Claude 3.5 Sonnet.*