AWQ script

by prudant - opened Jun 15, 2024

Jun 15, 2024

Hi, can you share your script to create the AWQ model? I'm trying to quantize another fine-tune of Qwen2. The process completes without errors, but the model doesn't work when I try to use it =/

erickbp

Owner Jun 18, 2024

Are you using AutoAWQ? https://github.com/casper-hansen/AutoAWQ
What library are you using for inference? I'm using vLLM https://docs.vllm.ai/

prudant

Jun 18, 2024

Yes, I'm using AutoAWQ, the patched version (it fixes the scale problems) from one of the Qwen contributors. For the engine I'm using Aphrodite Engine, which is heavily derived from vLLM but has higher throughput with AWQ and GPTQ.

erickbp

Owner Jun 18, 2024

I'll give Aphrodite a try.
Have you tried vLLM? It is fast running on an NVIDIA A100 (80GB), even though they say that AWQ support is under-optimized at the moment...
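
For anyone who wants to run the same check, here is a minimal sketch of loading an AWQ checkpoint in vLLM and generating one completion. The model path, prompt, and sampling values are placeholders (assumptions, not what was used in this thread), and the function name `generate_with_awq` is only illustrative.

```python
def generate_with_awq(model_path, prompt, max_tokens=64):
    """Load an AWQ-quantized checkpoint with vLLM and return one completion."""
    # Imported lazily so this file can be loaded without vLLM installed.
    from vllm import LLM, SamplingParams

    # quantization="awq" tells vLLM to load the checkpoint with its AWQ kernels.
    llm = LLM(model=model_path, quantization="awq")
    params = SamplingParams(temperature=0.7, max_tokens=max_tokens)

    # generate() returns one RequestOutput per prompt; take its first completion.
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

If the text that comes back is garbage while the unquantized model is fine, the problem is more likely in the quantization step than in the inference engine. A call would look like `generate_with_awq("path/to/qwen2-finetune-awq", "Hello, who are you?")`, with the path replaced by your own output directory.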

prudant

Jun 19, 2024

I haven't tried it because Aphrodite has some features that vLLM doesn't have (yet), but I'll give it a try anyway. The main point is that I never had a problem with AWQ and Aphrodite until now, which is pretty weird.

prudant

Jun 20, 2024

edited Jun 20, 2024

@erickbp but back to the original question: did you use https://github.com/casper-hansen/AutoAWQ to quantize Qwen2, or did you have to customize something to get a working quant?

erickbp

Owner Jun 20, 2024

No, I did not quantize this one myself due to hardware limitations. But yes, I have used AutoAWQ for all the other models that I have quantized, on the Azure VM "ND96asr A100 v4" (8 x A100). Is there any particular error that you are facing? I can give it a try, but I don't want to set up that expensive VM if you can already tell that it is going to fail.
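
For reference, a typical AutoAWQ run follows the pattern in the AutoAWQ README, roughly as sketched below. This is not the exact script used for any model in this thread: the paths in the usage note are placeholders, the function name `quantize_to_awq` is illustrative, and the config values are the common 4-bit README settings, assumed here rather than confirmed.

```python
# Common 4-bit AWQ settings from the AutoAWQ README (an assumption here,
# not necessarily the settings used for the model discussed in this thread).
QUANT_CONFIG = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}


def quantize_to_awq(model_path, quant_path, quant_config=QUANT_CONFIG):
    """Quantize a Hugging Face causal LM to AWQ and save it to quant_path."""
    # Imported lazily so this file can be loaded without AutoAWQ installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # Load the full-precision fine-tune and its tokenizer.
    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Run calibration and quantize the weights (uses AutoAWQ's default
    # calibration data unless another dataset is passed in).
    model.quantize(tokenizer, quant_config=quant_config)

    # Save the quantized weights together with the tokenizer files.
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

A call would look like `quantize_to_awq("path/to/qwen2-finetune", "path/to/qwen2-finetune-awq")`. The installed AutoAWQ version has to support the model architecture; otherwise the run can finish without errors and still produce a checkpoint that does not work.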

prudant

Jun 29, 2024

Already solved: AutoAWQ did not support Qwen natively, and after the latest commits it now does... that was the problem!

erickbp

Owner Jun 30, 2024

Glad to hear! Thanks for letting me know.
