Confused about the need for CFG
Quoting the model card: "Since the distillation has been fine tuned out of the model, it uses classic CFG."
I'm not quite sure what the implications are. Does that mean this wasn't trained to understand the "guidance" hyperparameter like Flux Dev and Schnell (and I believe Pro too)? Does it work with CFG_scale set to 1, to avoid having to compute the negative-prompt/unconditional predicted noise at each step?
@stduhpf No, flux dev uses something called distilled guidance, which is different from the normal guidance used by SDXL, SD1.5, PixArt, and most other models. Distilled guidance does not support negative prompts.
Flux schnell doesn’t even use distilled guidance or normal guidance.
This model is basically a heavy finetune of flux schnell that uses normal guidance and works with 20 or more steps, similar to flux dev. That gives you image quality close to dev, an Apache 2.0 license, and negative prompt support.
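For anyone following along, here's a rough sketch of the difference (the `model` calls and shapes are placeholders, not the actual Flux API): with classic CFG the sampler runs the model twice per step, once with the prompt and once with the negative/empty prompt, and mixes the two predictions, whereas distilled guidance passes the guidance value into a single forward pass, so there is no unconditional branch to steer away from.

```python
import numpy as np

def classic_cfg_step(model, x_t, t, cond, uncond, cfg_scale):
    """One denoising step with classic CFG (SD1.5/SDXL-style, and this model).

    Two forward passes per step: one conditional, one unconditional/negative.
    """
    noise_cond = model(x_t, t, cond)      # prediction with the prompt
    noise_uncond = model(x_t, t, uncond)  # prediction with the negative/empty prompt
    # Push the result away from the unconditional prediction
    return noise_uncond + cfg_scale * (noise_cond - noise_uncond)

def distilled_guidance_step(model, x_t, t, cond, guidance):
    """One step with distilled guidance (flux dev-style): the guidance value is
    just an extra model input, so only one forward pass is needed and there is
    no unconditional branch -- hence no negative prompts."""
    return model(x_t, t, cond, guidance=guidance)

# Toy stand-in for the transformer, only so the sketch runs end to end.
def toy_model(x_t, t, cond, guidance=None):
    return x_t * 0.0 + cond  # obviously not a real denoiser

x = np.zeros((2, 2))
print(classic_cfg_step(toy_model, x, 10, cond=1.0, uncond=0.0, cfg_scale=3.5))
```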
Got it! The model card should definitely say those things.
So, if I understand correctly, we should expect this model to take about twice the time per step compared to dev and schnell, and to require as many steps as dev?
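As a back-of-the-envelope check (the step counts below are just the commonly cited defaults, not benchmarks): classic CFG needs two model evaluations per step unless CFG_scale is 1, in which case the unconditional prediction cancels out of the formula and a sampler can skip that pass entirely.

```python
def model_calls_per_image(steps, cfg_scale):
    """Rough count of transformer forward passes for one image.

    With classic CFG the unconditional prediction only matters when
    cfg_scale != 1, so a sampler can skip it at scale 1 (many UIs do).
    """
    calls_per_step = 1 if cfg_scale == 1.0 else 2
    return steps * calls_per_step

# Purely illustrative defaults:
print(model_calls_per_image(steps=4,  cfg_scale=1.0))  # schnell-style: 4 passes
print(model_calls_per_image(steps=20, cfg_scale=1.0))  # dev-style distilled: ~20 passes
print(model_calls_per_image(steps=20, cfg_scale=3.5))  # this model with CFG: ~40 passes
```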