2bit
#2
by
KnutJaegersberg
- opened
You should get your model quantized to 2-bit, as done in https://huggingface.co/GreenBitAI/LLaMA-3B-2bit-groupsize32,
so that we can all use as much context length as possible at the best quality on consumer hardware.