Here I publish the results of my experiments and my subjective opinions on hyperparameters for PonyXL, gathered while trying to make a Senko LoRA for it.
Training was done on 120 images over 20 epochs, 6,600 steps in total. I checked only the last epochs.
Batch Size == 1
TE LR == UNet LR
I didn't use gradient checkpointing.
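For reference, here is a minimal sketch of how these shared settings map onto a kohya-ss sd-scripts style configuration. This assumes that trainer; the keys mirror its CLI arguments, and the model/dataset paths are hypothetical placeholders:

```python
# Minimal sketch of the shared settings, assuming the kohya-ss sd-scripts
# trainer; keys mirror its CLI arguments. Paths are hypothetical placeholders.
common_args = {
    "pretrained_model_name_or_path": "ponyDiffusionV6XL.safetensors",  # placeholder
    "train_data_dir": "./senko_ds6",   # placeholder dataset path
    "train_batch_size": 1,             # Batch Size == 1
    "max_train_epochs": 20,            # 20 epochs, ~6600 steps total
    "network_module": "networks.lora",
    # TE LR == UNet LR: leaving text_encoder_lr/unet_lr unset makes both
    # fall back to learning_rate.
    # gradient_checkpointing intentionally left disabled.
}
```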
<lora:senko_ds6_ponyxl_lr1_linear_prodigy_dim16_alpha8:1>
overfit
<lora:senko_ds6_ponyxl_lr1_linear_prodigy_dim32_alpha16:1>
overfit
Prodigy works a bit worse with artist tags and style LoRAs, but the results still look good. It can probably be useful if you are fine with the default style the LoRA produces.
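Note that LR == 1 is the normal setting for Prodigy, since it estimates the step size adaptively and lr only scales that estimate. A minimal sketch using the prodigyopt package, with a toy module standing in for the LoRA weights and an illustrative weight_decay value:

```python
import torch
from prodigyopt import Prodigy

model = torch.nn.Linear(16, 16)  # toy stand-in for the LoRA parameters

# lr=1.0 is Prodigy's recommended default: the effective step size is
# estimated adaptively, and lr merely scales it. weight_decay is illustrative.
optimizer = Prodigy(model.parameters(), lr=1.0, weight_decay=0.01)

# "linear" in the run names above refers to a linear decay of the LR
# multiplier down to 0 over training (6600 steps here).
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.0, total_iters=6600
)
```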
<lora:senko_ds6_ponyxl_lr1e-4_constant_adamw8_dim32_alpha16:1>
overfit
<lora:senko_ds6_ponyxl_lr1e-5_constant_adamw8_dim16_alpha1:1>
doesn't work
<lora:senko_ds6_ponyxl_lr1e-5_constant_adamw8_dim16_alpha8:1>
OK
<lora:senko_ds6_ponyxl_lr1e-5_constant_adamw8_dim32_alpha16:1>
OK (published as senko-ponyxl-v2)
AdamW at 1e-4 bakes in the style of the dataset, which is noticeable on some grids. When I tried to train a LoRA with LR == 1e-3, the loss hit 1.0 by the second epoch, so I stopped training. But AFAIR I used dim32/alpha16, which won't work with such a high LR at all, so that was my mistake.
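A sketch of the AdamW8bit setup, assuming the bitsandbytes implementation. Unlike Prodigy, the LR here is literal, which is consistent with 1e-4 overcooking the style and 1e-3 diverging:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(16, 16)  # toy stand-in for the LoRA parameters

# 8-bit AdamW with a constant LR. In these runs 1e-5 was the sweet spot:
# 1e-4 baked in the dataset style, and 1e-3 diverged by the second epoch.
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-5)
```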
<lora:senko_ds6_ponyxl_lr3e-4_constant_adafactor_dim16_alpha1:1>
OK
<lora:senko_ds6_ponyxl_lr3e-4_constant_adafactor_dim16_alpha8:1>
OK
<lora:senko_ds6_ponyxl_lr3e-4_constant_adafactor_dim32_alpha16:1>
OK (published as senko-ponyxl-v1)
Adafactor with LR == 3e-4 works fine across different dim/alpha params.
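One caveat worth knowing if you reproduce this with the Hugging Face implementation of Adafactor: by default it computes its own relative step size and ignores the lr argument, so a fixed LR like 3e-4 only takes effect with the adaptive features turned off (your trainer may already handle this for you). A minimal sketch:

```python
import torch
from transformers.optimization import Adafactor

model = torch.nn.Linear(16, 16)  # toy stand-in for the LoRA parameters

# Adafactor only honors an explicit lr when its adaptive step-size
# machinery is disabled; otherwise the lr argument is ignored.
optimizer = Adafactor(
    model.parameters(),
    lr=3e-4,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)
```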
<lora:senko_ds6_ponyxl_locon_lr1_linear_prodigy_dim16_alpha8_conv16_convalpha_8:1>
breaks anatomy on complex concepts
<lora:senko_ds6_ponyxl_locon_lr1_linear_prodigy_dim16_alpha8_conv32_convalpha_16:1>
TE overfit
<lora:senko_ds6_ponyxl_locon_lr1_linear_prodigy_dim32_alpha16_conv16_convalpha_8:1>
TE overfit
<lora:senko_ds6_ponyxl_locon_lr1_linear_prodigy_dim32_alpha16_conv32_convalpha_16:1>
TE overfit
I didn't find good hyperparameters for LoCon with the Prodigy optimizer: it either breaks anatomy or ignores the prompt entirely.
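For completeness, the LoCon runs add convolutional layers to the adapter. Assuming the LyCORIS module under kohya-ss sd-scripts, the extra settings for the first variant above would look roughly like this (a sketch of the configuration, not a working recipe, given the results):

```python
# LoCon variant of the kohya-style sketch above, assuming the LyCORIS
# network module; conv_dim/conv_alpha govern the added convolutional
# layers alongside the usual linear dim/alpha.
locon_args = {
    "network_module": "lycoris.kohya",
    "network_dim": 16,
    "network_alpha": 8,
    "network_args": ["algo=locon", "conv_dim=16", "conv_alpha=8"],
}
```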
<lora:senko_ds6_sdxl_lr1e-5_constant_adamw8_dim32_alpha16:1>
doesn't work
<lora:senko_ds6_sdxl_lr3e-4_constant_adafactor_dim32_alpha16:1>
doesn't work
<lora:senko_ds6_counterfeitxl_lr1e-5_constant_adamw8_dim32_alpha16:1>
doesn't work
The LoRA I made using the base SDXL checkpoint doesn't work, and the same goes for the LoRA trained on CounterfeitXL.