Why is the Q4_0 version the same size as the Q4_K_M one?

by deleted - opened Mar 13

The same is true of TheBloke's GGUFs, including those for Dolphin 2.5 and 2.6, yet for every other Mixtral I've seen, including Smaug and Nous-Hermes-2, the Q4_K_M version is larger than the Q4_0 one (27.7 vs 25.8 GB).

I thought _K_M meant that higher-precision quantization (more bits per weight) was used for some blocks, so the file should be larger than the _0 version.
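
To spell out the arithmetic behind that expectation, here is a rough sketch. The block sizes are the ggml layouts as I understand them; the parameter count and the 85/15 split between Q4_K and Q6_K are made-up illustration values, not what any particular file actually uses, and `estimate_gb` is a hypothetical helper that ignores metadata and unquantized tensors.

```python
# (bytes per block, weights per block), from the ggml block layouts.
BLOCK_LAYOUT = {
    "Q4_0": (18, 32),    # 16 bytes of 4-bit values + one fp16 scale
    "Q4_K": (144, 256),  # 128 bytes of 4-bit values + 12 bytes of scales/mins + two fp16 values
    "Q6_K": (210, 256),  # 192 bytes of 6-bit values + 16 int8 scales + one fp16 scale
}


def bits_per_weight(qtype):
    """Average storage cost of one weight in the given quantization type."""
    block_bytes, block_weights = BLOCK_LAYOUT[qtype]
    return block_bytes * 8 / block_weights


def estimate_gb(n_params, mix):
    """Estimate file size in GB; mix maps quant type -> fraction of weights stored in it."""
    avg_bpw = sum(frac * bits_per_weight(q) for q, frac in mix.items())
    return n_params * avg_bpw / 8 / 1e9


N_PARAMS = 46.7e9  # assumption: approximate total parameter count of a Mixtral 8x7B

pure_q4_0 = estimate_gb(N_PARAMS, {"Q4_0": 1.0})
pure_q4_k = estimate_gb(N_PARAMS, {"Q4_K": 1.0})
mixed = estimate_gb(N_PARAMS, {"Q4_K": 0.85, "Q6_K": 0.15})  # made-up split

print(f"Q4_0 only: {pure_q4_0:.1f} GB")
print(f"Q4_K only: {pure_q4_k:.1f} GB")
print(f"Q4_K with 15% Q6_K: {mixed:.1f} GB")
```

By this estimate Q4_0 and plain Q4_K cost the same number of bits per weight, so a Q4_K_M file should only come out larger if some tensors really are stored in a higher-bit type. Identical sizes would suggest that few or none of them are in these particular files, though I haven't confirmed that.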

Anyways, thanks for the release. This is more just about idle curiosity.
