GPTQ version? Thanks!

#3
by YaTharThShaRma999 - opened

Would be nice to have a GPTQ version so we could run it on limited VRAM. Thanks for making these models!

@TheBloke Check this one out, bro

@sinanisler I'd recommend the GGUF version instead for now. Since ExLlama does not support multimodal GPTQ, llama.cpp with GGUF is much faster than GPTQ here.
Here's a LLaVA 7B GGUF:
https://huggingface.co/jartine/llava-v1.5-7B-GGUF
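
In case it helps, here's a minimal sketch of loading a LLaVA 1.5 GGUF through llama-cpp-python (the Python bindings for llama.cpp). The file names below are just placeholders; grab the actual main model GGUF and the separate mmproj (CLIP projector) GGUF from that repo:

```python
# Minimal sketch: running a LLaVA 1.5 GGUF with llama-cpp-python.
# File names are placeholders; substitute the actual files from the repo linked above.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The vision projector (mmproj) ships as its own GGUF alongside the LLM weights.
chat_handler = Llava15ChatHandler(clip_model_path="llava-v1.5-7b-mmproj-Q4_0.gguf")

llm = Llama(
    model_path="llava-v1.5-7b-Q4_K.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,       # leave room for the image embedding tokens
    logits_all=True,  # required by the LLaVA chat handler
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant that describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
                {"type": "text", "text": "What is in this picture?"},
            ],
        },
    ]
)
print(response["choices"][0]["message"]["content"])
```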

@YaTharThShaRma999
Thank you, I didn't see this one.

I will try it :)
