Add torchao int4 weight only quantization as an option
Summary:
This is a follow-up to https://github.com/huggingface/optimum-benchmark/pull/297/
to make torchao available as an option on the leaderboard.
Eventually we want to add torchao.autoquant as an option as well, once it is
integrated into TorchAoConfig.
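For readers unfamiliar with the term, "int4 weight only" means that only the weights are stored as 4-bit integers (plus scales), while activations stay in higher precision. The sketch below is a toy, pure-Python illustration of that idea using symmetric per-group scaling. It is not torchao's kernel or its exact scheme, and the function names and `group_size=4` are made up for the example.

```python
# Toy illustration only: NOT torchao's implementation or its exact quantization scheme.

def quantize_int4_groupwise(weights, group_size=4):
    """Symmetric per-group quantization to the signed int4 range [-8, 7]."""
    q_values, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One scale per group; fall back to 1.0 for an all-zero group.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        q_values.extend(max(-8, min(7, round(w / scale))) for w in group)
    return q_values, scales


def dequantize_int4_groupwise(q_values, scales, group_size=4):
    """Map the int4 codes back to floats using each group's scale."""
    return [q * scales[i // group_size] for i, q in enumerate(q_values)]


weights = [0.7, -0.35, 0.0, 0.1, 1.4, -1.4, 0.2, 0.7]
q_values, scales = quantize_int4_groupwise(weights)
restored = dequantize_int4_groupwise(q_values, scales)
```

Each restored weight differs from the original by at most half of its group's scale, which is why smaller groups generally trade extra scale storage for lower error.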
Test Plan:
leaderboard?
Thanks for the great contribution!
I am currently running some benchmarks to get some values for torchao
Thanks for merging, @baptistecolle. For a little more context: I upstreamed just one popular quantization flavor in torchao (int4_weight_only) for now, to get some initial perf data. In the future I also plan to upstream another one called "autoquant" (https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#autoquantization), which will automatically search through all available quantization flavors in torchao and pick the best-performing model for the specific hardware, under some accuracy constraint (SQNR).
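To make the "best performing under an accuracy constraint" idea concrete, here is a minimal sketch. It assumes SQNR is measured in decibels as 10·log10(signal power / noise power) and that each candidate has already been benchmarked. The helper names, candidate names, and numbers are all hypothetical, and this is not how autoquant is actually implemented.

```python
import math


def sqnr_db(reference, approximation):
    """Signal-to-quantization-noise ratio in dB between two equal-length sequences."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approximation))
    if noise == 0:
        return math.inf  # identical outputs: no quantization noise
    return 10 * math.log10(signal / noise)


def pick_fastest_within_budget(candidates, min_sqnr_db):
    """Return the name of the lowest-latency candidate meeting the SQNR floor.

    candidates: list of (name, latency_ms, sqnr_db) tuples (hypothetical layout).
    Returns None when no candidate is accurate enough.
    """
    eligible = [c for c in candidates if c[2] >= min_sqnr_db]
    return min(eligible, key=lambda c: c[1])[0] if eligible else None


# Made-up measurements, for illustration only.
candidates = [
    ("flavor_a", 12.0, 18.0),
    ("flavor_b", 15.0, 32.0),
    ("flavor_c", 20.0, 41.0),
]
choice = pick_fastest_within_budget(candidates, min_sqnr_db=30.0)
```

With these made-up numbers, `flavor_a` is the fastest but fails the 30 dB floor, so the selection falls to the next fastest candidate that passes.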
Also, how is the leaderboard updated? Is it documented somewhere?
I just made the repo for the backend of the leaderboard public (again)
https://github.com/huggingface/llm-perf-backend
The documentation is lacking for now, as it is more of an internal tool to manage the leaderboard
@baptistecolle thanks! when will we be able to see the update to the dashboard itself? Just trying to make sure the changes are reflected in the dashboard