Would it be possible to provide an IQ4_XS quant?

#1
by fsmedberg - opened

Hi!

TL;DR: Could you please make an IQ4_XS quant of the magnum-v4-72b model?

First, a HUGE thank you for training and providing these models. They are excellent at helping with story writing!

I was looking through the GGUF files provided with the latest magnum-v4-72b model, and compared to the GGUFs provided with the previous magnum-v2-72b, no imatrix quant files are included. Specifically, I would love a magnum-v4-72b-IQ4_XS.gguf, as it would allow me to run the model on my MacBook M3 Max with 64 GB RAM. The regular Q4_K_M doesn't seem to fit completely in VRAM (it seems Ollama, or my OS, only exposes up to 75% of memory, i.e. 48 GB). The Q3_K_L would fit, but I'm afraid the quality degradation might be too severe?
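For a rough sense of the trade-off above: GGUF file size scales with the average bits per weight (bpw) of the quant type. A minimal Python sketch, where the parameter count and the bpw figures are approximations I'm assuming (commonly cited llama.cpp averages), not numbers from the model card:

```python
# Rough estimate of GGUF file sizes from approximate bits-per-weight (bpw).
# All constants below are assumptions for illustration, not exact values.

PARAMS = 72.7e9          # assumed parameter count for a 72B model
BYTES_PER_GB = 1e9       # decimal GB, matching how file sizes are usually listed

BPW = {
    "Q4_K_M": 4.85,      # approximate average bpw
    "IQ4_XS": 4.25,      # approximate average bpw
    "Q3_K_L": 3.90,      # approximate average bpw
}

def estimated_size_gb(bpw: float, params: float = PARAMS) -> float:
    """Estimated model file size in GB for a given bits-per-weight."""
    return params * bpw / 8 / BYTES_PER_GB

VRAM_BUDGET_GB = 64 * 0.75  # ~48 GB exposed out of 64 GB unified memory

for quant, bpw in BPW.items():
    print(f"{quant}: ~{estimated_size_gb(bpw):.1f} GB "
          f"(budget: {VRAM_BUDGET_GB:.0f} GB)")
```

Note that the KV cache and runtime overhead sit on top of the raw file size, which is why a quant that is only a few GB under the 48 GB budget can still fail to fit entirely in VRAM.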

Best regards,
Fredrik

Anthracite org

Glad you're enjoying them! mradermacher has queued the IQ quants for 123b and 72b: http://hf.tst.eu/status.html

lucyknada changed discussion status to closed
