Would it be possible to provide an IQ4_XS quant?
Hi!
TL;DR: Could you please make an IQ4_XS quant of the magnum-v4-72b model?
First, a HUGE thank you for training and providing these models. They are excellent at helping with story writing!
I was looking through the GGUF files provided with the latest magnum-v4-72b, and unlike the previous magnum-v2-72b release, no importance-matrix (imatrix) quants are included. Specifically, I would love a magnum-v4-72b-IQ4_XS.gguf, as it would let me run the model on my MacBook M3 Max with 64 GB of RAM. The regular Q4_K_M doesn't seem to fit completely in VRAM, since Ollama (or macOS) only exposes about 75% of unified memory (48 GB) to the GPU. The Q3_K_L would fit, but I'm afraid the quality degradation might be too severe.
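For anyone wondering why IQ4_XS is the sweet spot here, below is a minimal back-of-the-envelope fit check in Python. The numbers are assumptions, not measurements: ~72.7B parameters (the Qwen2.5-72B base), commonly cited llama.cpp bits-per-weight averages, and a guessed overhead allowance for the KV cache and compute buffers.

```python
# Back-of-the-envelope fit check for 72B GGUF quants on a 64 GB Mac.
# All figures are approximate assumptions: the parameter count is the
# Qwen2.5-72B base (~72.7B), the bits-per-weight values are commonly
# cited llama.cpp averages, and the overhead allowance is a rough guess.

PARAMS = 72.7e9              # approximate parameter count
VRAM_BUDGET_GB = 0.75 * 64   # macOS exposes roughly 75% of unified memory
OVERHEAD_GB = 4.0            # rough allowance for KV cache + compute buffers

BPW = {                      # approximate effective bits per weight
    "Q4_K_M": 4.85,
    "IQ4_XS": 4.25,
    "Q3_K_L": 4.27,
}

for quant, bpw in BPW.items():
    weights_gb = PARAMS * bpw / 8 / 1e9
    total_gb = weights_gb + OVERHEAD_GB
    verdict = "fits" if total_gb <= VRAM_BUDGET_GB else "does not fit"
    print(f"{quant}: ~{weights_gb:.1f} GB weights + overhead "
          f"~{total_gb:.1f} GB -> {verdict} in {VRAM_BUDGET_GB:.0f} GB")
```

Note that IQ4_XS lands at roughly the same file size as Q3_K_L while keeping 4-bit quality, which is exactly why it's the attractive option here; real Qwen-family 72B files also tend to run somewhat larger than these estimates because of the big vocabulary/embedding tensors, making Q4_K_M even tighter in practice.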
Best regards,
Fredrik
Glad you're enjoying them! mradermacher has queued the IQ quants for 123b and 72b: http://hf.tst.eu/status.html