Other GGUF quants?
#5
by
Hesajon
- opened
With FP8 I can do 43 frames of i2v on my 3060 12GB, but any more frames and I hit OOM. The Q8 is too large and I get OOM as well. Q4 works for any number of frames, but the quality is not great. How do we make other quants, like Q5 or Q6, which might work better for i2v on 12GB cards? Is there a conversion script somewhere? Or could you help make more quant variants?
Kind of strange: with FP8 I can only do 43 frames of i2v, but with t2v I can do as many frames as I want. The extra memory for the encoded image must push it just over the limit on 12GB.
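For a rough sense of where Q5 or Q6 would land between Q4 and Q8, here is a back-of-the-envelope sketch of weight size per quant type. The bits-per-weight values come from the standard GGUF block layouts (for example, Q8_0 stores 32 weights in 34 bytes). The parameter count `N_PARAMS` is a placeholder assumption, not this model's real size, and the estimate covers weights only: real files usually keep some tensors at higher precision, and activations, latents, and the encoded image come on top.

```python
# Bits per weight = block size in bytes * 8 / weights per block.
BITS_PER_WEIGHT = {
    "Q4_0": 18 * 8 / 32,    # 4.5
    "Q5_0": 22 * 8 / 32,    # 5.5
    "Q6_K": 210 * 8 / 256,  # 6.5625
    "FP8": 8.0,             # plain 8-bit float, no block scales
    "Q8_0": 34 * 8 / 32,    # 8.5
}


def weight_gib(n_params: float, quant: str) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 2**30


# Hypothetical parameter count, for illustration only.
N_PARAMS = 13e9

for quant in BITS_PER_WEIGHT:
    print(f"{quant:5s} ~{weight_gib(N_PARAMS, quant):.1f} GiB")
```

Note that Q8_0 comes out slightly larger than FP8 because of the per-block scale, which would fit with FP8 barely working while Q8 runs out of memory.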