woodcut illustration
Model description
Trained on illustrations from the *Nouveau dictionnaire encyclopédique universel illustré*, a.k.a. the Trousset encyclopedia.
FLUX.1 [dev] already performs decently with “woodcut illustration”, but it sometimes uses mid-tones instead of proper black-and-white shading.
This LoRA is kinda scuffed, but it demonstrates some progress in influencing that style, or maybe it just biases the scene toward the encyclopedia's subject matter. It seems to like foliage.
Its effect may be more visible on something like Dedistilled-Mix or Fusion than on stock FLUX.1 [dev].
Trigger words
Training captions began with the phrase “woodcut illustration.”
They also included “1880s, 19th century”; I haven't tested prompting with those.
Download model
- `trousset-landscapes.epoch31`: From a more specific subset of 97 images with a consistent aesthetic: cityscapes, landscapes, and a few other scenes with a similar, mostly full-frame style. Appears effective at encouraging finer lines for shading, compared to the base model's rendering of “woodcut illustration”.
- `woodcut-illustration.epoch8`: First published attempt, sourced from 336 images and trained for 8 epochs (2688 steps). Some effect, but not great.
Weights for this model are available in Safetensors format.
Download them in the Files & versions tab.
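For inference with diffusers, loading the LoRA might look something like the sketch below. This assumes the LoRA applies cleanly on top of stock FLUX.1 [dev]; the `weight_name` filename is a guess based on the names above, so check the Files & versions tab for the actual filename.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# weight_name is assumed; check the Files & versions tab for the real filename
pipe.load_lora_weights(
    "keturn/woodcut-illustrations-Trousset-LoRA",
    weight_name="trousset-landscapes.epoch31.safetensors",
)

prompt = "woodcut illustration of a harbor town, 1880s, 19th century"
image = pipe(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("harbor-woodcut.png")
```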
Methodology
I chose to use illustrations from le Trousset because it could provide hundreds of images in a relatively consistent style. Source images were 1600px on the long side; I excluded those with more extreme aspect ratios (beyond 16:10), then resized to 1024px for RAM's sake.
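That filtering and resizing step could be done with a short script along these lines (a sketch; the directory names are hypothetical):

```python
from pathlib import Path
from PIL import Image

src, dst = Path("trousset-1600px"), Path("trousset-1024px")  # hypothetical paths
dst.mkdir(exist_ok=True)

for path in sorted(src.glob("*.png")):  # adjust the glob for your source format
    img = Image.open(path)
    w, h = img.size
    # skip aspect ratios more extreme than 16:10
    if max(w, h) / min(w, h) > 1.6:
        continue
    # scale so the long side becomes 1024px
    scale = 1024 / max(w, h)
    img.resize((round(w * scale), round(h * scale)), Image.LANCZOS).save(dst / path.name)
```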
Trained with kohya-ss/sd-scripts (2024-12-15). Some potentially relevant settings:
- `--optimizer_type adafactor --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" --lr_scheduler constant_with_warmup --learning_rate 8e-4 --model_prediction_type raw --guidance_scale 1 --loss_type l2`
  - These are fluxgym's defaults.
- `--network_dim 16 --network_alpha 16`
  - Raising network_dim from fluxgym's default to 12 or 16 definitely helped. I read somewhere that alpha should be set to match the dim, and raising it from the default 1.0 seems to have helped.
- `--network_args "train_blocks=single" "train_single_block_indices=18-37" "verbose=True"`
  - In hopes that focusing on the upper half of the blocks would prioritize small-scale detail over large-scale structure. (FLUX.1's single transformer blocks are indexed 0-37, so 18-37 is the upper half; this was adjusted to `9-37` for the `trousset-landscapes.epoch31` training.)
- `--timestep_sampling shift --discrete_flow_shift 1.0`
  - fluxgym's default shift is 3.1. Unverified hunch that bringing this back down shifts training toward lower-noise timesteps, which matter more for small-scale detail; see the formula below.
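For intuition: assuming sd-scripts applies the standard SD3/FLUX timestep shift, a shift factor $m$ remaps the sampled noise level $\sigma \in (0, 1)$ as

$$\sigma' = \frac{m\,\sigma}{1 + (m - 1)\,\sigma}$$

so $m = 3.1$ pushes samples toward high-noise timesteps (where large-scale structure is decided), while $m = 1$ leaves the distribution unshifted, giving relatively more weight to low-noise timesteps and fine detail.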
To accommodate a 12 GB VRAM budget, training used bfloat16 precision with the diffusion transformer at float8. The T5 encoder was not trained.
Potential
I can think of lots of things that might improve the result, though I don't have a good sense of which would be most effective.
- More dataset curation. The images were relatively consistent, but the botanical illustrations are better than the mammals, the engineering illustrations are different from the cityscapes, etc. I did some of this in the form of:
  - `trousset-bugs-and-botany-sources.txt`: mostly flowers; close-up detail illustrations with no backgrounds.
  - `trousset-landscape-sources.txt`: cityscapes and landscapes and a few other scenes with a similar mostly-full-frame style.
- Dataset augmentation. (I haven't yet tried tricks like mirroring the images, and I'm not sure more is better at this point.)
- Multi-resolution training. FLUX is usually pretty good over a range of sizes, but this LoRA seems to suffer more when generating below the roughly one-megapixel size it was trained at. I think some discretion is necessary in how images are cropped or downsized, to avoid losing the character of the lines used for shading.
- More detailed captions? Less detailed captions?
- Include the T5 encoder in training. (Requires a bigger server.)
- Include the autoencoder in training. (Seems relevant for styles with narrow high-contrast lines.)
- Esoteric hyperparameter stuff. Different layers/blocks/rank/alpha?
Base model: ashen0209/Flux-Dev2Pro