EXL2 Quants
#2 by Adzeiros - opened
Can someone make some EXL2 quants of this, please?
Preferably 5.0 bpw, and possibly 4.0 bpw. I want to load 32k context, and I am able to do so with 5.0 bpw on Midnight Miqu 70B with a 4-bit KV cache.
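For reference, this is roughly how I load an EXL2 quant with a 4-bit KV cache at 32k context. This is just a minimal sketch assuming a recent exllamav2 install; the model path is a placeholder and the exact class names can vary between exllamav2 versions.

```python
# Minimal sketch: load an EXL2 quant with a quantized (4-bit) KV cache at 32k context.
# Path is a placeholder; API details may differ between exllamav2 versions.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q4, ExLlamaV2Tokenizer

model_dir = "/models/Nimbus-Miqu-v0.1-70B-5.0bpw-exl2"  # hypothetical local path

config = ExLlamaV2Config(model_dir)
config.max_seq_len = 32768                    # request the full 32k context

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)   # 4-bit KV cache keeps 32k context within VRAM
model.load_autosplit(cache)                   # split layers across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
```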
I have just quantized it to 4.0 bpw:
PedroPareja/Nimbus-Miqu-v0.1-70B-4.0bpw-exl2
Adding to the model card.
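For anyone who wants to reproduce a quant like this at another bitrate, the conversion step looks roughly like the sketch below. The paths are placeholders and the flags follow exllamav2's convert.py conventions, which may differ slightly between versions, so check the exllamav2 docs for your install.

```python
# Rough sketch of producing a 4.0 bpw EXL2 quant with exllamav2's convert.py.
# All paths are placeholders; flag names may differ between exllamav2 versions.
import subprocess

subprocess.run(
    [
        "python", "convert.py",                              # run from the exllamav2 repo root
        "-i", "/models/Nimbus-Miqu-v0.1-70B",                # source FP16 model (placeholder path)
        "-o", "/tmp/exl2-work",                              # working directory for measurement files
        "-cf", "/models/Nimbus-Miqu-v0.1-70B-4.0bpw-exl2",   # output folder for the finished quant
        "-b", "4.0",                                         # target bits per weight
    ],
    check=True,
)
```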