I have now tried two quantizations, Q8_0 and Q6_K; they both fail as shown below.

#2 opened by BigDeeper

~/ollama/ollama run phi-3-mini-128k-instruct.Q6_K
Error: llama runner process no longer running: -1
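
For context, a minimal sketch of how a local GGUF like this is typically registered with ollama before running it (the file name and Modelfile below are assumptions, not taken from the thread):

# Modelfile pointing at the downloaded quant (path is hypothetical)
FROM ./Phi-3-mini-128k-instruct.Q6_K.gguf

# build the ollama model from the Modelfile, then run it
~/ollama/ollama create phi-3-mini-128k-instruct.Q6_K -f Modelfile
~/ollama/ollama run phi-3-mini-128k-instruct.Q6_K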

microsoft/Phi-3-mini-4k-instruct-gguf does not cause the same error.

See the relevant GitHub issue here:
https://github.com/ggerganov/llama.cpp/issues/6849

Quant Factory org

Quants have been updated with the latest release for llama.cpp
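
Since the files in the repository were replaced, picking up the fix means re-downloading the updated GGUF and rebuilding the local ollama model. A rough sketch, assuming the repo id QuantFactory/Phi-3-mini-128k-instruct-GGUF and the Q6_K filename (both assumptions):

# re-download the refreshed quant (repo id and filename are assumed)
huggingface-cli download QuantFactory/Phi-3-mini-128k-instruct-GGUF Phi-3-mini-128k-instruct.Q6_K.gguf --local-dir .

# remove the old ollama model and recreate it from the new file
~/ollama/ollama rm phi-3-mini-128k-instruct.Q6_K
~/ollama/ollama create phi-3-mini-128k-instruct.Q6_K -f Modelfile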

munish0838 changed discussion status to closed
