How did you train it?
Basically, how did you train the distillation out of it?
I generated 20k+ images with Flux Schnell using random prompts designed to cover a wide variety of styles and subjects. I began training Schnell on these images, which gradually caused the distillation to break down. It took many iterations of training at a fairly low LR to preserve as much knowledge as possible while breaking down only the distillation. However, this was extremely slow.

I tested a few things to speed it up and found that training with a CFG of 2-4, with a blank unconditional, drastically sped up the breakdown of the distillation. I trained this way until it appeared to converge. However, that leaves the model in a somewhat unstable state, so I then trained it without CFG to re-stabilize it. Further training will be done without any CFG, as it no longer appears to be needed. I plan to do more aesthetic tuning from here on out.
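For anyone wondering what "training with CFG and a blank unconditional" might look like in practice, here is a minimal sketch of one way to implement that kind of training step. This is my reading of the description above, not the actual training code: `model`, the embedding tensors, `target`, and the function name are all placeholders.

```python
import torch
import torch.nn.functional as F

def cfg_training_loss(model, noisy_latents, timesteps,
                      cond_embeds, uncond_embeds,
                      target, guidance_scale=3.0):
    """Hypothetical de-distillation step using CFG with a blank unconditional.

    `uncond_embeds` would be the text encoding of an empty prompt, and
    `target` the usual noise/flow-matching regression target.
    """
    # Predict with the actual prompt and with the blank (empty-string) prompt.
    pred_cond = model(noisy_latents, timesteps, cond_embeds)
    pred_uncond = model(noisy_latents, timesteps, uncond_embeds)

    # Classifier-free guidance combination (scale 2-4, per the post).
    pred = pred_uncond + guidance_scale * (pred_cond - pred_uncond)

    # Regress the guided prediction toward the training target.
    return F.mse_loss(pred.float(), target.float())
```

After the distillation is broken down, the same loop would drop the guidance combination and train on `pred_cond` alone, which matches the "re-stabilize without CFG" phase described above.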
What are the methods of distillation? Is LADD or ADD used?