Reconvert GGUF for the MoE, due to llama.cpp update
#1
by
CombinHorizon
- opened
would you please re-convert the GGUF using a version of llama.cpp newer than 2024-04-03, for better performance?
see
https://github.com/ggerganov/llama.cpp/#hot-topics
MoE memory layout has been updated - reconvert models for mmap
support and regenerate imatrix
https://github.com/ggerganov/llama.cpp/pull/6387
thx
I found a solution for everyone who owns this GGUF file:
./quantize --allow-requantize can convert the old format to the new format.
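A minimal sketch of that requantize step, assuming a llama.cpp build that includes PR #6387; the file names here are hypothetical placeholders, and the final argument should match the quant type of your existing file:

```shell
# Rewrite an old-layout MoE GGUF into the new mmap-friendly layout.
# --allow-requantize permits quantizing from an already-quantized input.
./quantize --allow-requantize \
    old-moe-model.Q4_K_M.gguf \
    new-moe-model.Q4_K_M.gguf \
    Q4_K_M
```

Note that requantizing from an already-quantized file keeps any quality loss baked into the original quantization; converting fresh from the source weights would be preferable when possible.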
due to an internet traffic limit, I cannot upload the new GGUF, sorry about that.