Add quants for Q5
#2
by
dzupin
- opened
Hi,
Your quants are the best of what is currently on Hugging Face for deepseek-coder-33b.
I just compared your ggml-deepseek-coder-33b-instruct-q4_k_m.gguf with deepseek-coder-33b-instruct.Q4_K_M made by TheBloke on set of my python tests.
Your Q4_K_M model passed with flying colors, while the other Q4_K_M failed almost half of my tests. I expected similar performance, but that is not the case.
Would you also consider creating a Q5_K_S quant? (This should be the largest quant that still fits into 24 GB of VRAM with 4K context.)
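For what it's worth, here is a rough back-of-envelope estimate of why Q5_K_S should just fit in 24 GB. The bits-per-weight figure (~5.5 for Q5_K_S) and the deepseek-coder-33b architecture numbers (62 layers, hidden size 7168, 8 KV heads of dimension 128, giving a KV dimension of 1024 with GQA) are my assumptions, so treat this as a sketch rather than a measurement:

```python
# Back-of-envelope VRAM estimate for a Q5_K_S quant of deepseek-coder-33b.
# All constants below are assumed/approximate, not measured.

def quant_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate quantized model size in GB (decimal)."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# Q5_K_S averages roughly 5.5 bits per weight (k-quant mix, assumed).
model_gb = quant_size_gb(33.0, 5.5)

# Rough fp16 KV-cache cost: 2 (K and V) * layers * context * kv_dim * 2 bytes.
# Assumed: 62 layers, KV dim 1024 (8 KV heads * head dim 128, via GQA).
kv_gb = 2 * 62 * 4096 * 1024 * 2 / 1e9

total_gb = model_gb + kv_gb
print(f"model ~{model_gb:.1f} GB + 4K KV cache ~{kv_gb:.1f} GB = ~{total_gb:.1f} GB")
```

Under these assumptions the total lands just under 24 GB, which matches your estimate; Q5_K_M would likely spill over.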