PseudoTerminal X committed on
Commit
de5387c
1 Parent(s): 442ca71

Update README.md

Files changed (1)
  1. README.md +22 -18
README.md CHANGED
@@ -2,22 +2,22 @@
2
  license: openrail++
3
  ---
4
 
5
- # Terminus XL Gamma (v2 preview)
6
 
7
  ## Model Details
8
 
9
  ### Model Description
10
 
11
- Terminus XL Gamma is a new state-of-the-art latent diffusion model that uses zero-terminal SNR noise schedule and velocity prediction objective at training and inference time.
12
 
13
- Terminus is based on a similar architecture to SDXL, and has the same layout. It has been trained on fewer steps with very high quality data captions via COCO and Midjourney.
14
 
15
- This model will not be capable of as many concepts as SDXL, and some subjects will simply look very bad.
16
 
17
- The objective of this model was to use v-prediction and min-SNR gamma loss to efficiently train a full zero-terminal SNR model on a single A100-80G.
18
 
19
 
20
- - **Fine-tuned from:** ptx0/terminus-xl-gamma-v1
21
  - **Developed by:** pseudoterminal X (@bghira)
22
  - **Funded by:** pseudoterminal X (@bghira)
23
  - **Model type:** Latent Diffusion
@@ -32,13 +32,17 @@ The objective of this model was to use v-prediction and min-SNR gamma loss to ef
32
 
33
  ### Direct Use
34
 
35
- Terminus XL Gamma can be used for generating high-quality images given text prompts. It should particularly excel at inpainting tasks, where a zero-terminal SNR noise schedule allows it to more effectively retain contrast.
 
 
36
 
37
  The model can be utilized in creative industries such as art, advertising, and entertainment to create visually appealing content.
38
 
39
  ### Downstream Use
40
 
41
- Terminus XL Gamma can be fine-tuned for specific tasks such as image super-resolution, style transfer, and more.
 
 
42
 
43
  ### Out-of-Scope Use
44
 
@@ -58,27 +62,27 @@ Users should be cautious of potential biases in the generated images and thoroug
58
 
59
  This model's success largely depended on a somewhat small collection of very high quality data samples.
60
 
61
- * LAION-HD, filtered down to EXIF samples without watermarks. Luminance value of samples capped to 100 (.5).
62
- * Midjourney 5.2 dataset `ptx0/mj-general` with zero filtration.
 
 
63
 
64
  ### Training Procedure
65
 
66
  #### Preprocessing
67
 
68
- Most of the existing process for terminus-xl-gamma-v1 was followed, with the exception of training extensively on cropped images using SDXL's crop coordinates to improve fine details.
69
-
70
- No images were upsampled during this training session. Images were downsampled using LANCZOS instead of BICUBIC filters to attain higher image fidelity and maintain more image context for the model to learn from.
71
 
72
- Only high-quality photos were used in this training session, greatly improving the realism qualities.
73
 
74
- ~770,000 images were used for this training run.
75
 
76
  #### Training Hyperparameters
77
 
78
  - **Training regime:** bf16 mixed precision
79
- - **Learning rate:** \(4 \times 10^{-7}\) to \(8 \times 10^{-7}\), cosine schedule
80
- - **Epochs:** 60
81
- - **Batch size:** 24 * 15 = 360
82
 
83
  #### Speeds, Sizes, Times
84
 
 
2
  license: openrail++
3
  ---
4
 
5
+ # Terminus XL Otaku (v1 preview)
6
 
7
  ## Model Details
8
 
9
  ### Model Description
10
 
11
+ Terminus XL Otaku is a latent diffusion model that uses a zero-terminal SNR noise schedule and a velocity-prediction objective at training and inference time.
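For readers unfamiliar with the term, the following is a minimal sketch of how a noise schedule can be rescaled to zero terminal SNR, following the commonly published rescaling recipe. It is not taken from this commit: the function names, the 1000-step count, and the `beta_start`/`beta_end` values are assumptions chosen to resemble a typical Stable Diffusion style schedule.

```python
import math


def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Assumed baseline schedule: linear in sqrt(beta), as in SD-style configs."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def alphas_cumprod(betas):
    """Cumulative product of (1 - beta_t), i.e. alpha_bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def rescale_zero_terminal_snr(betas):
    """Shift and scale sqrt(alpha_bar) so the last step has exactly zero SNR."""
    abar_sqrt = [math.sqrt(a) for a in alphas_cumprod(betas)]
    first, last = abar_sqrt[0], abar_sqrt[-1]
    # Shift so the final value is 0, then scale so the first value is unchanged.
    abar_sqrt = [(s - last) * first / (first - last) for s in abar_sqrt]
    abar = [s * s for s in abar_sqrt]
    # Recover per-step alphas from consecutive ratios of alpha_bar.
    alphas = [abar[0]] + [abar[i] / abar[i - 1] for i in range(1, len(abar))]
    return [1.0 - a for a in alphas]


original = scaled_linear_betas()
rescaled = rescale_zero_terminal_snr(original)
```

With the rescaled schedule, the final timestep is pure noise, which is why such models are usually paired with a velocity-prediction objective rather than noise prediction.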
12
 
13
+ Terminus is a new state-of-the-art model family based on SDXL's architecture, and is compatible with (most) SDXL pipelines.
14
 
15
+ For Terminus Otaku (this model), the training data is exclusively anime/celshading/3D renders and other hand-drawn or synthetic art styles.
16
 
17
+ The objective of this model was to continue using the v-prediction objective and min-SNR gamma loss to adapt Terminus Gamma v2's outputs to a more artistic style.
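As a rough illustration of min-SNR gamma loss weighting under v-prediction, here is one common formulation of the per-timestep weight. This is a sketch, not the training code behind this commit: the function names are hypothetical, and `gamma=5.0` is an assumed default, since the commit does not state the value used.

```python
def snr_from_alpha_bar(alpha_bar):
    """Signal-to-noise ratio for a given alpha_bar (requires alpha_bar < 1)."""
    return alpha_bar / (1.0 - alpha_bar)


def min_snr_weight_v_prediction(snr, gamma=5.0):
    """Clamp the SNR at gamma, then divide by (SNR + 1) for v-prediction."""
    return min(snr, gamma) / (snr + 1.0)


# Low-noise timesteps (high SNR) are down-weighted; the zero-SNR terminal
# timestep stays well defined because the denominator is SNR + 1, not SNR.
weights = [min_snr_weight_v_prediction(s) for s in (0.0, 1.0, 5.0, 100.0)]
```

Dividing by `snr + 1` rather than `snr` matters here: with a zero-terminal SNR schedule, the noise-prediction form of the weight would divide by zero at the last timestep.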
18
 
19
 
20
+ - **Fine-tuned from:** ptx0/terminus-xl-gamma-v2
21
  - **Developed by:** pseudoterminal X (@bghira)
22
  - **Funded by:** pseudoterminal X (@bghira)
23
  - **Model type:** Latent Diffusion
 
32
 
33
  ### Direct Use
34
 
35
+ Terminus XL Otaku can be used for generating high-quality images given text prompts.
36
+
37
+ It should particularly excel at inpainting tasks for animated subject matter, where a zero-terminal SNR noise schedule allows it to more effectively retain contrast.
38
 
39
  The model can be utilized in creative industries such as art, advertising, and entertainment to create visually appealing content.
40
 
41
  ### Downstream Use
42
 
43
+ Terminus XL Otaku can be fine-tuned for specific tasks such as image super-resolution, style transfer, and more.
44
+
45
+ However, it's recommended that the v1 preview not be used for fine-tuning until it is fully released, as any structural issues will hopefully be resolved by then.
46
 
47
  ### Out-of-Scope Use
48
 
 
62
 
63
  This model's success largely depended on a somewhat small collection of very high quality data samples.
64
 
65
+ * Indiscriminate use of NijiJourney outputs.
66
+ * Midjourney 5.2 outputs that mention anime styles in their tags.
67
+ * Niji and MJ Showcase images that were re-captioned using CogVLM.
68
+ * Anchor data of real human subjects in a small (10%) ratio to the animated material, to retain coherence.
69
 
70
  ### Training Procedure
71
 
72
  #### Preprocessing
73
 
74
+ This model is (so far) trained exclusively on cropped images using SDXL's crop coordinates to improve fine details.
 
 
75
 
76
+ No images were upsampled or downsampled during this training session. Instead, random crops (or unaltered 1024px square images) were used.
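The random-crop approach described above could look roughly like the sketch below, which picks a 1024px crop box and records the values SDXL-style pipelines accept as crop-coordinate conditioning. The helper name and the returned dictionary layout are hypothetical; the commit does not show the actual preprocessing code.

```python
import random


def random_crop_conditioning(width, height, crop=1024, rng=random):
    """Pick a random crop box and the matching SDXL-style conditioning values."""
    if width < crop or height < crop:
        # No upsampling is performed in this sketch, so small images are rejected.
        raise ValueError("image is smaller than the crop size")
    left = rng.randint(0, width - crop)  # randint is inclusive on both ends
    top = rng.randint(0, height - crop)
    return {
        "box": (left, top, left + crop, top + crop),  # (left, top, right, bottom)
        "original_size": (height, width),
        "crops_coords_top_left": (top, left),
        "target_size": (crop, crop),
    }


# An unaltered 1024px square image yields a zero offset.
square = random_crop_conditioning(1024, 1024)
# A larger image yields a random offset within bounds.
larger = random_crop_conditioning(2048, 1536, rng=random.Random(0))
```

The `box` tuple uses the (left, top, right, bottom) order that common image libraries expect for cropping, while the conditioning tuples use (height, width) and (top, left) ordering.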
77
 
78
+ ~50,000 images were used for this training run. Because collection continued throughout the process, the exact number is difficult to ascertain.
79
 
80
  #### Training Hyperparameters
81
 
82
  - **Training regime:** bf16 mixed precision
83
+ - **Learning rate:** \(1 \times 10^{-7}\) to \(1 \times 10^{-6}\), cosine schedule
84
+ - **Epochs:** 11
85
+ - **Batch size:** 12 * 8 = 96
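To put the updated hyperparameters in perspective, here is a back-of-the-envelope sketch of the step count and a plain cosine decay between the two stated learning rates. Several things are assumptions: the image count is only approximate (~50,000), the commit does not say what the factors 12 and 8 represent, and the exact shape of the cosine schedule (direction, warmup, restarts) is not specified.

```python
import math


def cosine_lr(step, total_steps, lr_max=1e-6, lr_min=1e-7):
    """Plain cosine decay from lr_max to lr_min (assumed schedule shape)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


effective_batch = 12 * 8                                # 96, as stated
steps_per_epoch = math.ceil(50_000 / effective_batch)   # assumes ~50,000 images
total_steps = steps_per_epoch * 11                      # 11 epochs, as stated

lr_start = cosine_lr(0, total_steps)
lr_end = cosine_lr(total_steps, total_steps)
```

Under these assumptions the run comes to a few thousand optimizer steps, far fewer than the 60-epoch, ~770,000-image Gamma run on the removed side of the diff.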
86
 
87
  #### Speeds, Sizes, Times
88