Q2 and Q3 model returning gibberish

#2
by atthoriqgf - opened

hello guys, currently I'm trying to use the q2_k and q3_k_m models with llama.cpp. When I run them as a server or in the CLI, even the simplest "hello" prompt returns long, unending gibberish. Is there something missing in how I run the model? When I try the q4 and q6 quants, they respond correctly.
Here's the command I use to run the model with llama.cpp (same command, with the model path swapped for the q2/q3 files):

llama-server -m D:\llamacpp\llama.cpp\models\qwen2.5-coder-32b-instruct-q4_k_m.gguf --port 8080 -ngl 46
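A quick way to narrow this down is to check whether the low-bit files are corrupted downloads and whether the gibberish still appears with greedy sampling and CPU-only inference (which takes the Vulkan backend out of the picture). This is just a diagnostic sketch; the q2_k filename below is a guess at how your file is named, so adjust the paths to your setup:

```shell
:: Hypothetical filename -- adjust to match your actual q2_k/q3_k_m GGUF files.
:: 1) Rule out a corrupted download: compare this hash against the SHA256
::    listed on the model's download page.
certutil -hashfile D:\llamacpp\llama.cpp\models\qwen2.5-coder-32b-instruct-q2_k.gguf SHA256

:: 2) Reproduce with greedy sampling (--temp 0) and no GPU offload (-ngl 0)
::    to see if the problem is the quant itself or the Vulkan backend:
llama-cli -m D:\llamacpp\llama.cpp\models\qwen2.5-coder-32b-instruct-q2_k.gguf -p "hello" -n 32 --temp 0 -ngl 0
```

If the hash matches and CPU-only output is still gibberish, the quant file itself is the likely culprit; if CPU output is fine but Vulkan output is garbled, it points at the backend.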

(screenshot attached: image.png)

my specs:
Windows 10
Ryzen 5 5600
64 GB RAM @ 3600 MHz
RX 6800 16 GB
using the Vulkan backend for llama.cpp