Add torchao int4 weight only quantization as an option

#34
No description provided.

Summary:
This is a follow-up to https://github.com/huggingface/optimum-benchmark/pull/297/
that makes torchao available as an option on the leaderboard.

Eventually we also want to add torchao.autoquant as an option, once it has been
integrated into TorchAoConfig.

Test Plan:
leaderboard?

jerryzh168 changed pull request status to open
Hugging Face Optimum org

Thanks for the great contribution!

I am currently running some benchmarks to get some values for torchao.

baptistecolle changed pull request status to merged

Thanks for merging, @baptistecolle. For a little more context: I have upstreamed only one popular quantization flavor in torchao (int4_weight_only) for now, to get some initial perf data. In the future I also plan to upstream another one called "autoquant" (https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#autoquantization), which will be able to automatically search through all available quantization flavors in torchao and pick the best-performing model for the specific hardware, under some accuracy constraint (SQNR).
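
To make the SQNR constraint a bit more concrete, here is a minimal sketch in plain Python. It is not torchao's autoquant implementation: `fake_quantize`, `pick_fastest`, the sample weights, and the latency numbers are all hypothetical and exist only to illustrate the idea of choosing the fastest candidate whose signal-to-quantization-noise ratio stays above a threshold.

```python
import math


def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in decibels (higher is better)."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, approx))
    if noise == 0:
        return math.inf
    return 10 * math.log10(signal / noise)


def fake_quantize(values, bits):
    """Symmetric per-tensor round-to-nearest quantization, then dequantization.

    Hypothetical helper for illustration; real int4 weight-only kernels
    typically use group-wise scales rather than one scale per tensor.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax
    if scale == 0:
        return list(values)
    return [max(-qmax - 1, min(qmax, round(v / scale))) * scale for v in values]


def pick_fastest(candidates, reference, min_sqnr_db):
    """Return the name of the lowest-latency candidate that meets the SQNR floor.

    `candidates` is a list of (name, latency_ms, values) tuples.
    Returns None when no candidate is accurate enough.
    """
    accurate = [c for c in candidates if sqnr_db(reference, c[2]) >= min_sqnr_db]
    if not accurate:
        return None
    return min(accurate, key=lambda c: c[1])[0]


# Toy "weights" and made-up latencies (assumptions, not measurements).
weights = [0.12, -0.53, 0.98, -0.27, 0.44, -0.81, 0.05, 0.66]
candidates = [
    ("int4", 1.0, fake_quantize(weights, 4)),
    ("int8", 2.0, fake_quantize(weights, 8)),
]

for name, latency_ms, values in candidates:
    print(f"{name}: latency={latency_ms} ms, SQNR={sqnr_db(weights, values):.1f} dB")

print("loose constraint  ->", pick_fastest(candidates, weights, min_sqnr_db=20))
print("strict constraint ->", pick_fastest(candidates, weights, min_sqnr_db=30))
```

With a loose threshold the faster 4-bit candidate is accepted; tightening the threshold forces the choice to the more accurate 8-bit one, which is the trade-off an accuracy-constrained search has to make.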

Also, how is the leaderboard updated? Is it documented somewhere?

Hugging Face Optimum org

I just made the repo for the backend of the leaderboard public (again)
https://github.com/huggingface/llm-perf-backend

The documentation is lacking for now, as it is more of an internal tool for managing the leaderboard.

@baptistecolle thanks! When will we be able to see the update on the dashboard itself? I just want to make sure the changes are reflected there.
