codehappy/puzzlebox-xl · Hugging Face

Puzzle Box XL

A latent diffusion model (LDM) geared toward illustration, style composability, and sample variety. Addresses a few deficiencies with the SDXL base model.

Architecture: SD XL (base model is v1.0)
Training procedure: U-Net fully unfrozen; all-parameter continued pretraining at learning rates between 3e-8 and 3e-7 for 14,290,000 steps (as of epoch 14, at batch size 4).

Trained on the Puzzle Box dataset, a large collection of permissively licensed images from the public Internet (or generated by previous Puzzle Box models). Each image has between 3 and 15 different captions, which are used interchangeably during training. The dataset contains 8.2 million images and 54 million captions.
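One way to picture "used interchangeably" is a loader that draws one of an image's captions each time the image is visited. The sketch below assumes uniform random selection; the actual training code is not described here, and the names `pick_caption` and `sample` are hypothetical.

```python
import random

def pick_caption(captions, rng=random):
    """Return one caption for this visit to the image.

    Assumption: captions are drawn uniformly at random; the real
    Puzzle Box training pipeline may select them differently.
    """
    if not captions:
        raise ValueError("each image needs at least one caption")
    return rng.choice(captions)

# Hypothetical dataset record: one image with several alternative captions.
sample = {
    "image": "example.png",
    "captions": [
        "a watercolor painting of a lighthouse at dusk",
        "lighthouse, watercolor, dusk, sea, traditional media",
        "An illustrated lighthouse on a rocky shore in the evening.",
    ],
}

rng = random.Random(0)  # seeded so the example is repeatable
for epoch in range(3):
    # The same image can be paired with a different caption on each pass.
    print(epoch, pick_caption(sample["captions"], rng))
```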

The model is substantially better than the base SDXL model at producing images that look like film photographs, any kind of cartoon art, or old artist styles. It's also heavily tuned toward personal aesthetic preference.

Prompt adherence is unusually good, and human evaluation rates the aesthetics as improved for generations between 1/4 and 1/2 megapixel in size. CFG scales between 2 and 7 can work well with Puzzle Box; experimenting with resolution and CFG scale for your prompts is encouraged.
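To experiment within the 1/4 to 1/2 megapixel range, it can help to convert a pixel budget and aspect ratio into concrete dimensions. This is a minimal sketch under two assumptions that do not come from the model card: one megapixel is taken as 1024 × 1024 pixels, and each side is rounded to a multiple of 64 (a conservative convention; SDXL itself needs sides divisible by 8). The function name `dims_for_megapixels` is hypothetical.

```python
import math

MEGAPIXEL = 1024 * 1024  # assumption: 1 MP = 1,048,576 pixels

def dims_for_megapixels(megapixels, aspect=1.0, multiple=64):
    """Return (width, height) near the requested pixel budget.

    aspect is width / height. Each side is rounded to the nearest
    multiple of `multiple`, so the final pixel count is approximate.
    """
    pixels = megapixels * MEGAPIXEL
    width = math.sqrt(pixels * aspect)
    height = math.sqrt(pixels / aspect)

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)

# Sweep the suggested size range and a few CFG scales from 2 to 7.
for mp in (0.25, 0.375, 0.5):
    for cfg in (2, 4.5, 7):
        w, h = dims_for_megapixels(mp, aspect=1.0)
        print(f"{mp} MP -> {w}x{h}, CFG scale {cfg}")
```

The resulting width, height, and CFG scale can then be passed to whichever SDXL front end or pipeline you use.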

Model checkpoints currently available:

from epoch 14, 14290k training steps, 02 December 2024
from epoch 13, 11930k training steps, 15 August 2024
from epoch 12, 10570k training steps, 21 June 2024

This model has been trained carefully on top of the SDXL base, with a highly diverse training set at a low learning rate. Accordingly, it should merge well with most other LDMs built on the SDXL base. (Merging LDMs built on the same base is a form of transfer learning; you can add Puzzle Box concepts to other SDXL models this way. Spherical interpolation works best.) The captions used in training are also varied: you can prompt Puzzle Box XL using English sentences, or booru-style with lists of tags. (If you prompt booru-style, don't use underscores in your tags; replace them with spaces. Tags may be separated by commas, by whitespace, or by any combination of the two.)
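For readers unfamiliar with the spherical interpolation mentioned above for merging, here is a minimal pure-Python sketch of slerp on a single flattened weight vector. It is an illustration only: the function name `slerp` and the fallback to linear interpolation are assumptions, and real merge tools apply the same idea tensor by tensor across two checkpoints.

```python
import math

def slerp(a, b, t, eps=1e-8):
    """Spherically interpolate between weight vectors a and b.

    t = 0 returns a, t = 1 returns b. Falls back to linear
    interpolation when either vector is near zero or the two are
    nearly parallel, where the slerp formula is ill-conditioned.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    def lerp():
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        return lerp()

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return lerp()

    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

# Toy example: halfway between two orthogonal unit vectors.
merged = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print(merged)
```

Unlike a plain weighted average, the halfway point between two orthogonal unit vectors keeps unit length here, which is the usual argument for slerp when merging.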
