3.75bpw quant.

#1
by Nexesenex - opened

Hey!
Could you make a 3.75bpw exl2-2 quant of this Llama 2 70B, please?
I run a 3090 + 3060 setup, and I'd like to test the best quant of the base Llama 2 70B that I can run at a decent context (alpha 2, 6912 ctx) with your new optimized quantization method.
Happy new year 2024, and thanks for providing fast inference at the best quantization quality available for the greatest number of people!
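
For anyone wondering how a 3.75bpw quant relates to this hardware, here is a rough back-of-the-envelope sketch. It is only an estimate under stated assumptions: the parameter count (about 69 billion) and the 12 GB variant of the 3060 are assumptions not stated above, `weights_gib` is a hypothetical helper, and the KV cache, activations, and per-GPU overhead are ignored, so the real headroom will be smaller.

```python
def weights_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB (bits -> bytes -> GiB)."""
    return n_params * bpw / 8 / 2**30


# Assumption: approximate parameter count of Llama 2 70B.
N_PARAMS = 69e9
# Assumption: 24 GiB on the 3090 plus the 12 GiB variant of the 3060.
VRAM_GIB = 24 + 12

size = weights_gib(N_PARAMS, 3.75)
# Whatever is left must hold the KV cache for the context, plus overhead.
headroom = VRAM_GIB - size

print(f"weights ~ {size:.1f} GiB, headroom ~ {headroom:.1f} GiB")
```

Changing the `bpw` argument shows how quickly the headroom left for context shrinks as the bitrate goes up.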
