
Official AQLM quantization of google/gemma-2b.

For this quantization, we used 2 codebooks of 8 bits.

Results:

| Model | AQLM scheme | WinoGrande | PiQA | HellaSwag | ArcE | ArcC | Model size, GB |
|---|---|---|---|---|---|---|---|
| gemma-2b | 2x8 | 0.5801 | 0.6828 | 0.3891 | 0.5791 | 0.2534 | 1.6 |

To learn more about inference, as well as how to quantize models yourself, please refer to the official GitHub repo.
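
Below is a minimal inference sketch, assuming a recent `transformers` release with AQLM support and the `aqlm` package installed; the `MODEL_ID` value is a placeholder and should be replaced with this repository's actual Hub id.

```python
# Minimal AQLM inference sketch (assumptions: recent transformers with AQLM
# support, `aqlm` installed, and MODEL_ID replaced with this repo's actual id).
# pip install aqlm[gpu] transformers accelerate

from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ISTA-DASLab/gemma-2b-AQLM-2Bit-2x8"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # place the quantized weights on the available device
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```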