ISTA-DASLab
/

Mistral-7B-Instruct-v0.2-AQLM-2Bit-2x8

Text Generation

text-generation-inference

Inference Endpoints

8-bit precision

Model card Files Files and versions Community

Mistral-7B-Instruct-v0.2-AQLM-2Bit-2x8 / README.md

SpiridonSunRotator's picture

SpiridonSunRotator

Update README.md

7f3acf3 verified 7 months ago

|

history blame contribute delete

518 Bytes

	---
	library_name: transformers
	tags:
	- mistral
	- finetuned
	- conversational
	- text-generation-inference
	---
	Official [AQLM](https://arxiv.org/abs/2401.06118) quantization of [mistralai/Mistral-7B-Instruct-v0.2
	](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).

	For this quantization, we used 2 codebooks of 8 bits.

	Results:
	\| Model \| Quantization \| MMLU (5-shot) \| Model size, Gb \|
	\|------\|------\|------\|------\|
	\|mistralai/Mistral-7B-Instruct-v0.2 \| None \| 0.5912 \| 14.5 \|
	\| \| 2x8 \| 0.4384 \| 2.3 \|