Possible to work with 8GB VRAM and 16GB RAM?
Hey, is it possible to use this model with an RTX 2060 Super (8 GB VRAM) and 16 GB RAM? The graphics card is also being used for other things.
The goal of this project is to optimize the transformer without changing the precision of the model weights. The problem is that T5 XXL (without quantization) alone takes more than 8 GB of VRAM.
If you offload to system RAM, it's not a problem; ComfyUI can do that. And of course you can use smaller quants if needed.
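ComfyUI handles the offload automatically (you can also force it with the `--lowvram` or `--novram` launch flags). If you're scripting with diffusers instead, the same idea looks roughly like the sketch below; this assumes the repo loads with `FluxPipeline`, so adjust for whatever variant you actually downloaded:

```python
import torch
from diffusers import FluxPipeline

# Load in bf16; checkpoint name is this repo, swap in your own variant.
pipe = FluxPipeline.from_pretrained(
    "Freepik/flux.1-lite-8B-alpha",
    torch_dtype=torch.bfloat16,
)

# Streams each submodule to the GPU only while it runs and keeps
# everything else in system RAM -- slow, but minimal VRAM use.
pipe.enable_sequential_cpu_offload()

image = pipe(
    "a cat holding a sign that says hello",
    height=1024,
    width=1024,
    num_inference_steps=24,
).images[0]
image.save("out.png")
```

`enable_model_cpu_offload()` is a faster middle ground, but it needs the largest single component to fit in VRAM, and an 8B transformer in bf16 (~16 GB) won't fit on an 8 GB card.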
I've used this model with 12 GB system RAM and 12 GB VRAM, no problem.
If needed, you can also use smaller T5 XXL quants at the cost of some quality, but that's the case with all quants. If you use, for example, t5 v1_1 at Q8, you should have enough memory for everything, and the quality is very close to the full-fat fp16 T5 XXL. Plus, this model is fast even when offloaded into system RAM.
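For reference, loading one of these quants outside ComfyUI looks roughly like this (a sketch, assuming a recent transformers build with GGUF support and the `gguf` package installed; the repo and file names below are examples of community encoder quants, not something from this thread):

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

# Illustrative community quant -- verify the repo/file before use.
repo = "city96/t5-v1_1-xxl-encoder-gguf"
gguf = "t5-v1_1-xxl-encoder-Q8_0.gguf"

# Note: transformers dequantizes GGUF tensors at load time, so this
# mainly saves download/disk size. ComfyUI-GGUF keeps the weights
# quantized at runtime, which is where the memory savings come from.
text_encoder = T5EncoderModel.from_pretrained(
    repo,
    gguf_file=gguf,
    torch_dtype=torch.bfloat16,
)

tokenizer = T5Tokenizer.from_pretrained("google/t5-v1_1-xxl")
tokens = tokenizer("a cat holding a sign", return_tensors="pt")
embeddings = text_encoder(**tokens).last_hidden_state
print(embeddings.shape)
```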
Guide to run on <= 16 GB VRAM:
https://huggingface.co/Freepik/flux.1-lite-8B-alpha/discussions/11#6726621bba7d809f4ee6a9cc
It fits in 8 GB, including the compute buffer for 1024x1024 images, when quantized to Q4_K.
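Rough back-of-envelope math on why Q4_K fits (a sketch; Q4_K averages about 4.5 bits per weight, and the compute-buffer figure is an estimate, not a measurement):

```python
# Approximate VRAM for an 8B-parameter transformer at Q4_K.
params = 8e9
bits_per_weight = 4.5                      # Q4_K averages ~4.5 bits/weight
weights_gib = params * bits_per_weight / 8 / 2**30
print(f"weights: ~{weights_gib:.1f} GiB")  # ~4.2 GiB

# That leaves roughly 3-4 GiB of an 8 GiB card for the compute
# buffer at 1024x1024, which is why Q4_K squeezes in.
```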