Comparisons to FP8 e4m3fn?
#3, opened by NielsGx
Hi,
I'm currently using the FP8 e4m3fn variant (4.55GB) on a 16GB VRAM GPU.
I'm mainly interested in whether Q6_K and Q8 would give better quality.
And what would VRAM usage look like?
They're both roughly comparable in size to it (4GB and 5GB).
I think quality is generally more faithful to FP16 with the quantized versions, especially Q6_K and Q8_K. FP8 is a much simpler format. There are some comparison images in the comments of the PR where I added it, though I don't think anyone has tested this extensively. VRAM usage should be similar, though Comfy handles loading/unloading if it has to.
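To give a rough feel for why a block-scaled 8-bit format can track the original weights more closely than FP8, here is a minimal NumPy sketch. It is only an illustration under assumptions: `fake_fp8_e4m3fn` and `fake_q8_0` are hypothetical helpers that simulate the rounding (they are not the actual loader or kernel code), the Q8_0-style scheme assumes blocks of 32 values sharing one FP16 scale, the FP8 path assumes out-of-range values saturate at 448, and the "weights" are random Gaussian numbers rather than anything from a real model. Q6_K uses a more involved layout and is not covered here.

```python
import numpy as np


def fake_fp8_e4m3fn(x):
    """Simulate rounding to FP8 e4m3fn (4 exponent bits, 3 mantissa bits)."""
    x = np.asarray(x, dtype=np.float64)
    mag = np.abs(x)
    # Exponent of each value, clamped at -6 (smallest normal exponent).
    # Smaller values land on the subnormal grid, which has the same spacing.
    exp = np.floor(np.log2(np.maximum(mag, 2.0 ** -6)))
    step = 2.0 ** (exp - 3)  # 3 mantissa bits -> 8 steps per power of two
    out = np.round(mag / step) * step
    out = np.minimum(out, 448.0)  # assumption: saturate at the max finite value
    return np.sign(x) * out


def fake_q8_0(x, block=32):
    """Simulate Q8_0-style quantization: int8 values plus one FP16 scale per block."""
    x = np.asarray(x, dtype=np.float64).reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scale = scale.astype(np.float16).astype(np.float64)  # scale is stored as FP16
    safe = np.where(scale > 0, scale, 1.0)  # avoid dividing by zero on all-zero blocks
    q = np.clip(np.round(x / safe), -127, 127)
    return (q * scale).reshape(-1)


def rmse(a, b):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))


# Hypothetical stand-in for a weight tensor (not real model weights).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=4096)

print("FP8 e4m3fn RMSE:", rmse(w, fake_fp8_e4m3fn(w)))
print("Q8_0-style RMSE:", rmse(w, fake_q8_0(w)))
```

The point of the sketch is the difference in how precision is spent: FP8 has a fixed grid with only 3 mantissa bits per value, while the block-scaled format fits its 8-bit grid to the range of each block. Reconstruction error on random numbers is not the same thing as image quality, so the comparison images are still the better guide.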