Quantization error
Has anyone successfully used llama.cpp to quantize this model to GGUF? I tried to quantize it to Q5_K_M, but I hit an error about missing weights when the model was loaded for quantization. Why is this happening?
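For context, here is a minimal sketch of the two-step flow I followed (paths and filenames are placeholders; this assumes a recent llama.cpp checkout, where the converter script is `convert_hf_to_gguf.py` and the quantize binary is `llama-quantize`):

```python
# Sketch of the convert-then-quantize flow (all paths are placeholders).
import subprocess

HF_MODEL_DIR = "path/to/hf-model"   # local Hugging Face checkpoint
F16_GGUF = "model-f16.gguf"         # full-precision intermediate GGUF
Q5_GGUF = "model-Q5_K_M.gguf"       # quantized output

# Step 1: convert the HF weights to a full-precision GGUF file.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", HF_MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# Step 2: quantize the GGUF to Q5_K_M; this is where the error appears for me.
subprocess.run(
    ["./llama-quantize", F16_GGUF, Q5_GGUF, "Q5_K_M"],
    check=True,
)
```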
I was able to quantize to Q3_K_S successfully for testing. I no longer have the quantized model, but I didn't have to do anything special to get it working. I have never seen an error like that during quantization.
I have no interest in redownloading this model for quantization. If you are unable to quantize it, then I recommend finding a different model.
I searched for related issues on GitHub and found that this might be due to missing MoE support in llama.cpp. I'll update to the latest llama.cpp first and then see whether the error still occurs.
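Roughly what I plan to run, as a sketch (assuming a CMake-based build in a local llama.cpp checkout; the directory path is a placeholder):

```python
# Sketch: update a local llama.cpp checkout, rebuild, then retry quantization.
import subprocess

LLAMA_CPP_DIR = "path/to/llama.cpp"  # placeholder for the local checkout

# Pull the latest sources, which should include newer MoE support.
subprocess.run(["git", "pull"], cwd=LLAMA_CPP_DIR, check=True)

# Rebuild with CMake (Release config).
subprocess.run(["cmake", "-B", "build"], cwd=LLAMA_CPP_DIR, check=True)
subprocess.run(["cmake", "--build", "build", "--config", "Release"],
               cwd=LLAMA_CPP_DIR, check=True)

# Then re-run the convert/quantize steps from my first post.
```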