ISTA-DASLab/Mistral-7B-v0.1-AQLM-2Bit-1x16-hf


SpiridonSunRotator

Added Mistral evaluation.

d8e3850 verified 4 months ago


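Official [AQLM](https://arxiv.org/abs/2401.06118) (Additive Quantization of Language Models) quantization of [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1).

This quantization uses one 16-bit codebook (the `1x16` scheme in the model name), which corresponds to roughly 2 bits per weight.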
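As a rough illustration of how the `1x16` scheme relates to the "2Bit" in the model name, the sketch below computes the code bits spent per weight. The group size of 8 weights per code is an assumption taken from the AQLM paper's usual configuration and is not stated in this card; the function name `bits_per_weight` is hypothetical. Codebook storage and any unquantized layers are ignored.

```python
def bits_per_weight(num_codebooks: int, codebook_bits: int, group_size: int) -> float:
    """Code bits stored per weight, ignoring codebook and scale overhead."""
    return num_codebooks * codebook_bits / group_size


NUM_CODEBOOKS = 1   # the "1" in 1x16 (from this card)
CODEBOOK_BITS = 16  # the "16" in 1x16 (from this card)
GROUP_SIZE = 8      # ASSUMPTION: weights encoded per code, not stated in this card

# A 16-bit code indexes one of 2**16 codebook entries.
codebook_entries = 2 ** CODEBOOK_BITS
bpw = bits_per_weight(NUM_CODEBOOKS, CODEBOOK_BITS, GROUP_SIZE)

print(f"codebook entries: {codebook_entries}")
print(f"bits per weight:  {bpw}")
```

Under that assumption, each group of 8 weights is replaced by a single 16-bit index, which gives the 2-bit average.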


Results (0-shot `acc`):

| Model | Quantization | WinoGrande | PiQA | HellaSwag | ArcE | ArcC | Model size, GB |
|------|------|------|-------|-------|-------|------|------|
| Mistral-7B-v0.1 | None | 0.7364 | 0.8047 | 0.6115 | 0.7887 | 0.4923 | 14.5 |
| Mistral-7B-v0.1 | 1x16 | 0.6914 | 0.7845 | 0.5745 | 0.7504 | 0.4420 | 2.51 |
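
To make the trade-off in the table easier to read, the following sketch computes the size reduction and the per-task accuracy drop from the numbers reported above. The variable names are illustrative only; no new measurements are introduced.

```python
# Values copied from the results table (0-shot accuracy, size in GB).
TASKS = ["WinoGrande", "PiQA", "HellaSwag", "ArcE", "ArcC"]
baseline = {"acc": [0.7364, 0.8047, 0.6115, 0.7887, 0.4923], "size_gb": 14.5}
quantized = {"acc": [0.6914, 0.7845, 0.5745, 0.7504, 0.4420], "size_gb": 2.51}

# How many times smaller the quantized checkpoint is.
compression_ratio = baseline["size_gb"] / quantized["size_gb"]

# Absolute accuracy drop per task (baseline minus quantized).
drops = {t: b - q for t, b, q in zip(TASKS, baseline["acc"], quantized["acc"])}

# Mean accuracy across the five tasks for each model.
mean_baseline = sum(baseline["acc"]) / len(TASKS)
mean_quantized = sum(quantized["acc"]) / len(TASKS)

print(f"compression ratio: {compression_ratio:.2f}x")
for task, drop in drops.items():
    print(f"{task:<11} drop: {drop:.4f}")
print(f"mean accuracy: {mean_baseline:.4f} -> {mean_quantized:.4f}")
```

On these figures the quantized model is a little under 6x smaller, with a mean accuracy drop of about 0.04 across the five tasks.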

For inference instructions and details on quantizing models yourself, see the [official GitHub repo](https://github.com/Vahe1994/AQLM).
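
As a minimal sketch of inference, the snippet below loads this checkpoint through the `transformers` library. It assumes that `transformers`, `accelerate`, and the `aqlm` package are installed and that a suitable GPU is available; none of this is stated in the card, so treat the GitHub repo as the authoritative source. The helper name `generate` is hypothetical.

```python
MODEL_ID = "ISTA-DASLab/Mistral-7B-v0.1-AQLM-2Bit-1x16-hf"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Load the quantized model and return a completion for `prompt`."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies (ASSUMPTION: transformers, accelerate, aqlm installed).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # use the dtype stored in the checkpoint config
        device_map="auto",   # let accelerate place the weights on available devices
    )

    # Tokenize, move tensors to the model's device, and generate.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("The capital of France is")` downloads the checkpoint on first use and returns the prompt followed by the model's continuation.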