Comparison to FP8 e4m3fn?

#3
by NielsGx - opened

Hi,
I'm currently using the FP8 e4m3fn variant (4.55GB) on a 16GB VRAM GPU.

I'm mainly interested in whether Q6_K and Q8 would give better quality.

And how would VRAM usage compare?
Both are roughly comparable in size (4GB and 5GB).

Owner

I think the output is generally more faithful to FP16 with these quants, especially Q6_K and Q8_K. FP8 is a much simpler format. There are some comparison images in the comments of the PR where I added it, though I don't think anyone has tested this extensively. VRAM usage should be similar, and Comfy handles loading/unloading if it has to.
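To make the "FP8 is a much simpler format" point a bit more concrete, here is a rough sketch that compares roundtrip error for a direct FP8 e4m3fn cast against a Q8-style block quantization (one shared scale per 32 weights, int8 values). It is an illustration under stated assumptions, not the actual conversion code used for these files: the weights are synthetic Gaussian samples, the FP8 cast is assumed to use no per-tensor scaling, the block scale is kept in full precision rather than FP16, and Q6_K (which has a more involved layout) is not modeled. All function names are made up for this example.

```python
import math
import random

E4M3_MAX = 448.0      # largest finite e4m3fn value
E4M3_MIN_EXP = -6     # exponent of the smallest normal value (2**-6)
MANTISSA_BITS = 3


def round_to_e4m3fn(x: float) -> float:
    """Round x to the nearest FP8 e4m3fn value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), E4M3_MAX)
    # frexp gives a = m * 2**ex with 0.5 <= m < 1, so the binade exponent is ex - 1.
    _, ex = math.frexp(a)
    e = max(ex - 1, E4M3_MIN_EXP)        # below 2**-6 the format is subnormal
    step = 2.0 ** (e - MANTISSA_BITS)    # spacing between neighbouring values
    return sign * min(round(a / step) * step, E4M3_MAX)


def q8_block_roundtrip(block):
    """Quantize one block to int8 with a shared scale, then dequantize.

    Simplification: the scale stays a Python float instead of being
    rounded to FP16 as a real block format would do.
    """
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    scale = amax / 127.0
    return [max(-127, min(127, round(v / scale))) * scale for v in block]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def compare(n=4096, sigma=0.02, block_size=32, seed=0):
    """Return (fp8_mse, q8_mse) for synthetic Gaussian weights (an assumption)."""
    rng = random.Random(seed)
    weights = [rng.gauss(0.0, sigma) for _ in range(n)]
    fp8 = [round_to_e4m3fn(w) for w in weights]
    q8 = []
    for i in range(0, n, block_size):
        q8.extend(q8_block_roundtrip(weights[i:i + block_size]))
    return mse(weights, fp8), mse(weights, q8)


if __name__ == "__main__":
    fp8_err, q8_err = compare()
    print(f"FP8 e4m3fn MSE: {fp8_err:.3e}")
    print(f"Q8-style   MSE: {q8_err:.3e}")
```

On small, roughly Gaussian weights like these, the block-scaled roundtrip should come out with the lower error, because each block's 8 bits are spread evenly over the range that block actually uses, while FP8 has only 3 mantissa bits per value. This says nothing definitive about image quality on the real model, which would still need side-by-side comparisons.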
