# FluffyRock Unbound v1.1

A finetune resumed from [Fluffyrock Unleashed v1.0](https://huggingface.co/RedRocket/Fluffyrock-Unleashed), with the following changes:

### Technical changes:

- Adaptive timestep weighting: Timesteps are weighted using a method similar to the one in the EDM2 paper, according to the homoscedastic uncertainty of the MSE loss at each timestep, thereby equalizing the contribution of each timestep. The loss weight was also conditioned on resolution in order to equalize the contribution of each resolution group. The overall effect is that the model is now very good at both high- and low-frequency details and is less biased towards blurry backgrounds.

- EMA weights were assembled post-hoc using the method described in the EDM2 paper. The shipped checkpoint uses an EMA length sigma of 0.225.

- Cross-attention masking was applied to extra, completely empty blocks of CLIP token embeddings, making the model work better with short prompts. Previously, an image with a short caption was fed in much as if you had added `BREAK BREAK BREAK` to the prompt in A1111. This caused the model to depend on those extra blocks, so it produced better images when given 225 tokens of input. The model no longer depends on this.

- The optimizer was replaced with schedule-free AdamW, and weight decay was turned off for bias parameters, which has greatly stabilized training.
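
The adaptive timestep weighting above can be illustrated with a minimal sketch. It assumes a lookup table of learned log-variances keyed by a hypothetical `(timestep_bin, resolution)` pair and plain SGD. EDM2 itself predicts the log-variance with a small network over the noise level, and the actual training code is not shown here, so every name and hyperparameter below is an assumption made for illustration.

```python
import math
import random


def weighted_loss(mse, log_var):
    """Uncertainty-weighted loss: mse / exp(log_var) + log_var."""
    return mse * math.exp(-log_var) + log_var


def fit_log_vars(mean_mse_by_bucket, steps=2000, lr=0.05, seed=0):
    """Learn one log-variance per bucket by SGD on the weighted loss.

    `mean_mse_by_bucket` maps a hypothetical (timestep_bin, resolution)
    key to the average raw MSE seen in that bucket. This is a stand-in
    for real per-sample losses from a diffusion model.
    """
    rng = random.Random(seed)
    log_vars = {bucket: 0.0 for bucket in mean_mse_by_bucket}
    for _ in range(steps):
        for bucket, mean_mse in mean_mse_by_bucket.items():
            # Simulate a noisy per-batch MSE around the bucket mean.
            mse = mean_mse * rng.uniform(0.8, 1.2)
            # d/du [mse * exp(-u) + u] = 1 - mse * exp(-u)
            grad = 1.0 - mse * math.exp(-log_vars[bucket])
            log_vars[bucket] -= lr * grad
    return log_vars


# Illustrative bucket losses only; these are not measurements from the model.
example_mse = {
    ("t_low", 512): 0.02,
    ("t_mid", 512): 0.3,
    ("t_high", 1024): 1.5,
}
example_log_vars = fit_log_vars(example_mse)
# Effective contribution of each bucket after weighting.
example_contrib = {
    bucket: example_mse[bucket] * math.exp(-example_log_vars[bucket])
    for bucket in example_mse
}
```

At the optimum each log-variance settles near the log of that bucket's mean MSE, so `mse * exp(-log_var)` is close to 1 for every bucket. That is the sense in which the contribution of each timestep and resolution group is equalized.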

### Data input changes:

- Low-resolution images were removed from the higher-resolution buckets. This removed approximately 1/3 of the images from the highest-resolution group. In our testing, we observed no negative impact on high-res generation quality, and this should improve fine details in high-res images.

- The tokenizer used for training inputs was set up to never split a tag across blocks. If a tag would cross the edge of a block, it is now moved to the next block. This is similar to how most frontends behave.

- Random dropout is now applied to implied tags. The overall effect should be that more specific tags are more powerful and less dependent on implied tags, while more general tags remain present and usable.
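
The implied-tag dropout and block packing described above might look roughly like the following sketch. The function names, the `drop_prob` parameter, the implication map, and the block size of 75 tokens (225 tokens split into three blocks) are assumptions for illustration, not values taken from the training code. Real token lists would come from the CLIP tokenizer, and separator tokens are ignored here.

```python
import random


def drop_implied_tags(tags, implications, drop_prob, rng):
    """Randomly drop tags that are implied by another tag in the caption.

    `implications` maps a tag to the tags it implies and is assumed to be
    transitively closed already. `drop_prob` is a hypothetical setting.
    """
    implied = set()
    for tag in tags:
        implied.update(implications.get(tag, ()))
    # Non-implied tags are always kept; implied ones survive with
    # probability 1 - drop_prob.
    return [t for t in tags if t not in implied or rng.random() >= drop_prob]


def pack_tags(tag_tokens, block_size=75):
    """Greedily pack per-tag token lists into blocks without splitting a tag."""
    blocks = [[]]
    for tokens in tag_tokens:
        if len(tokens) > block_size:
            raise ValueError("a single tag is longer than one block")
        # If the tag would cross the block edge, start a new block instead.
        if len(blocks[-1]) + len(tokens) > block_size:
            blocks.append([])
        blocks[-1].extend(tokens)
    return blocks


# Illustrative data only: the implication map and tags are made up, and
# splitting on "_" is a stand-in for a real tokenizer.
example_rng = random.Random(0)
example_implications = {"domestic_cat": ["felid", "mammal"]}
example_tags = ["domestic_cat", "felid", "mammal", "solo"]
example_kept = drop_implied_tags(example_tags, example_implications, 0.5, example_rng)
example_blocks = pack_tags([tag.split("_") for tag in example_kept], block_size=75)
```

Packing greedily keeps tags in caption order, at the cost of leaving a few unused positions at the end of a block whenever a tag is pushed to the next one.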

### Dataset changes:

- A sizeable overhaul of E621 tagging was done, removing several useless tags and renaming others. We are including new tag files that represent the current state of the dataset.