Add support for AQLM

#1
by BlackSamorez - opened

AQLM is a SOTA 2-bit LLM quantization algorithm that shows incredible precision for its compression ratio. It's fully integrated with transformers, and quite a few prequantized models are available.
Adding it to the leaderboard would shed light on what 2-bit quantization is really capable of.

Intel org
edited May 13

Hi @BlackSamorez, we will support AQLM as soon as possible! Thanks~

Intel org
edited May 14

@BlackSamorez please consider comparing your method with AutoRound, which has already shown remarkable results at W2G128 and W2G32, as presented in https://github.com/intel/auto-round/blob/main/docs/acc.md, without introducing any extra overhead at inference.

Intel org

Hi @BlackSamorez, we have added AQLM. We have evaluated 2 models so far and will add results for more models.

image.png

BlackSamorez changed discussion status to closed
