EXL2 Quants

#2 · opened by Adzeiros

Can someone make some EXL2 quants of this, please?
Preferably 5.0bpw and possibly 4.0bpw (I want to load 32k context, and I can do so at 5.0bpw on Midnight Miqu 70B with a 4-bit KV cache).
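For reference, this is a minimal sketch (untested) of loading an EXL2 quant at 32k context with a 4-bit KV cache through the exllamav2 Python API; the local model path and sampler settings are placeholders, adjust them to your setup:

```python
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Config,
    ExLlamaV2Cache_Q4,
    ExLlamaV2Tokenizer,
)
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Hypothetical path to a local EXL2 quant directory
model_dir = "models/Nimbus-Miqu-v0.1-70B-5.0bpw-exl2"

config = ExLlamaV2Config()
config.model_dir = model_dir
config.prepare()
config.max_seq_len = 32768                   # 32k context

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # 4-bit quantized KV cache
model.load_autosplit(cache)                  # split weights across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", settings, 200))
```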

I have just quantized it to 4.0bpw.
PedroPareja/Nimbus-Miqu-v0.1-70B-4.0bpw-exl2
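To pull the quant locally before loading it, something like this sketch with huggingface_hub should work; the local_dir is a placeholder:

```python
from huggingface_hub import snapshot_download

# Download the 4.0bpw EXL2 quant from the Hub
snapshot_download(
    repo_id="PedroPareja/Nimbus-Miqu-v0.1-70B-4.0bpw-exl2",
    local_dir="models/Nimbus-Miqu-v0.1-70B-4.0bpw-exl2",
)
```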

Adding it to the model card.
