Huge news for Kohya GUI - You can now fully Fine Tune / DreamBooth FLUX Dev on GPUs with as little as 6 GB of VRAM, without any quality loss compared to 48 GB GPUs - Moreover, Fine Tuning yields better results than any LoRA training could
LoRA Extraction: The checkpoints are 23.8 GB each, but you can extract a LoRA from them with almost no quality loss - I researched this and published a public article / guide on it as well
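To give a feel for what LoRA extraction means, here is a minimal sketch in Python. It assumes the extraction is done by taking a truncated SVD of the difference between the fine-tuned and base weights, which is a common approach; the post does not say how the actual tool does it. The function names `extract_lora` and `relative_error` and the toy matrix sizes are made up for illustration and are not Kohya's API.

```python
import numpy as np

def extract_lora(w_tuned, w_base, rank):
    """Approximate (w_tuned - w_base) with two low-rank factors (hypothetical helper)."""
    delta = w_tuned - w_base  # what fine-tuning changed in this layer
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep only the `rank` largest singular directions.
    lora_up = u[:, :rank] * s[:rank]  # shape (out_features, rank)
    lora_down = vt[:rank, :]          # shape (rank, in_features)
    return lora_up, lora_down

def relative_error(w_tuned, w_base, lora_up, lora_down):
    """Relative Frobenius error of the low-rank reconstruction of the weight delta."""
    delta = w_tuned - w_base
    return np.linalg.norm(delta - lora_up @ lora_down) / np.linalg.norm(delta)

# Toy data (assumption): a 64x48 layer whose fine-tuned delta is exactly rank 4.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 48))
w_tuned = w_base + rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))

up, down = extract_lora(w_tuned, w_base, rank=4)
print("rank 4 relative error:", relative_error(w_tuned, w_base, up, down))

up2, down2 = extract_lora(w_tuned, w_base, rank=2)
print("rank 2 relative error:", relative_error(w_tuned, w_base, up2, down2))
```

In this toy case the rank 4 error is close to zero because the delta really is rank 4, while rank 2 loses information. With a real fine-tuned checkpoint the delta is not exactly low rank, so the rank you choose trades file size against how much of the fine-tune is kept.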
Info: This is just mind-blowing. The recent improvements Kohya made to block swapping are just amazing.
Speeds are also amazing, as you can see in image 2 - of course those values are based on my researched config and were tested on an RTX A6000, which is almost the same speed as an RTX 3090
Also, all training experiments were run at 1024x1024px. If you use a lower resolution, it will use less VRAM and train faster
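As a rough illustration of why lower resolution helps, the sketch below compares the pixel count of a few square training resolutions against 1024x1024. It assumes that the resolution-dependent part of the work grows at least in proportion to the number of pixels; it says nothing about actual VRAM numbers, since model weights and optimizer state do not shrink with resolution. The helper name `relative_pixels` is made up for this example.

```python
def relative_pixels(width, height, ref_width=1024, ref_height=1024):
    """Pixel count of a resolution relative to the 1024x1024 reference (hypothetical helper)."""
    return (width * height) / (ref_width * ref_height)

for side in (512, 768, 1024):
    ratio = relative_pixels(side, side)
    print(f"{side}x{side}: {ratio:.4f}x the pixels of 1024x1024")
```

So 512x512 has only a quarter of the pixels of 1024x1024, and 768x768 a bit over half. Treat this only as a hint at the direction of the change, not as a prediction of VRAM use or speed.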
VRAM usage will change according to your own configuration - and likely speed as well
Moreover, Fine Tuning / DreamBooth yields better results than any LoRA could