Other GGUF quants?

#5
by Hesajon - opened

With FP8 I can do 43 frames of i2v on my 3060 12GB, but anything more goes OOM. Q8 is too big and OOMs as well. Q4 works for any number of frames, but the quality is not great. How do we make other quants, like Q5 or Q6, which might work better for i2v on 12GB cards? Is there a conversion script somewhere? Or could you help make more quant variants?
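For a rough sense of where Q5 or Q6 would land between Q4 and Q8, here is a small back-of-the-envelope sketch. The block layouts are the standard ggml ones for these quant types. The parameter count is a hypothetical placeholder, not this model's real size, and an actual GGUF file usually keeps some tensors at higher precision, so treat the numbers as estimates of the weights only (no activations, text encoder, or VAE).

```python
# Rough weight-only size estimate for a few GGUF quant types.
# (bytes per block, weights per block) from the ggml block layouts.
BLOCK_LAYOUT = {
    "Q4_0": (18, 32),    # fp16 scale + 32 x 4-bit values
    "Q5_0": (22, 32),    # fp16 scale + 32 high bits + 32 x 4-bit values
    "Q6_K": (210, 256),  # k-quant super-block of 256 weights
    "Q8_0": (34, 32),    # fp16 scale + 32 x 8-bit values
}


def bits_per_weight(quant: str) -> float:
    """Effective bits per weight, including the per-block scale overhead."""
    block_bytes, block_weights = BLOCK_LAYOUT[quant]
    return block_bytes * 8 / block_weights


def weight_size_gib(n_params: float, quant: str) -> float:
    """Approximate size in GiB if every weight used the given quant type."""
    return n_params * bits_per_weight(quant) / 8 / 2**30


# ASSUMPTION: hypothetical parameter count, replace with the real one.
N_PARAMS = 13e9

for quant in BLOCK_LAYOUT:
    bpw = bits_per_weight(quant)
    size = weight_size_gib(N_PARAMS, quant)
    print(f"{quant}: {bpw:.4f} bits/weight, about {size:.2f} GiB")
```

Whatever is left of the 12GB after the weights has to hold the latents for every frame plus the encoded image for i2v, so the gap between two quant levels gives a rough idea of how much frame headroom each one buys.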

Kind of strange: with FP8 I can only do 43 frames of i2v, but with t2v I can do as many frames as I want. The extra memory for the encoded image must be just enough to push it over the top on 12GB.
