Add support for AQLM

by BlackSamorez - opened

AQLM is a SOTA 2-bit LLM quantization algorithm that shows incredible precision for its compression ratio. It's fully integrated with transformers, and quite a few prequantized models are available.
Adding it to the leaderboard would shed light on what 2-bit quantization is really capable of.

Intel org
edited May 13

Hi @BlackSamorez, we will support AQLM as soon as possible! Thanks~

Intel org
edited May 14

@BlackSamorez please consider comparing your method with AutoRound, which has already shown remarkable results at W2G128 and W2G32, as presented in, without introducing any extra overhead at inference.

Intel org

Hi @BlackSamorez, we have added AQLM. We have evaluated 2 models so far and will add results for more models.


BlackSamorez changed discussion status to closed
