Question about making a GPTQ model compatible with AutoGPTQ
Thank you very much, TheBloke, for your work. I have previously used many of the GPTQ models you created, and now I want to try quantizing my own. Currently I am quantizing with GPTQ-for-LLaMa, but the resulting model only works with GPTQ-for-LLaMa: when I load it with AutoGPTQ, it generates incoherent responses. Is there a way to make the model compatible with both GPTQ-for-LLaMa and AutoGPTQ? Thank you very much!
AutoGPTQ should support models made with GPTQ-for-LLaMa. Did you create a quantize_config.json file, or manually pass in an appropriately configured BaseQuantizeConfig() in Python code?

Gibberish output usually occurs when there is a mismatch on the desc_act/--act-order setting, e.g. if you made the model with --act-order in GPTQ-for-LLaMa but then didn't set "desc_act": true in quantize_config.json for AutoGPTQ. Or vice versa.
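For illustration, here is a minimal sketch of loading such a checkpoint with AutoGPTQ while passing the quantization settings explicitly, so they match how the model was made. It uses AutoGPTQ's BaseQuantizeConfig and AutoGPTQForCausalLM.from_quantized; the model path, basename, bit width, and group size are placeholders that you would replace with whatever you actually used at quantization time:

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# These values must match the GPTQ-for-LLaMa settings used to make the model.
# In particular, desc_act=True corresponds to quantizing with --act-order;
# a mismatch here is what typically produces gibberish output.
quantize_config = BaseQuantizeConfig(
    bits=4,          # placeholder: match the bit width you quantized with
    group_size=128,  # placeholder: match --groupsize (-1 if none was used)
    desc_act=True,   # True only if the model was made with --act-order
)

model = AutoGPTQForCausalLM.from_quantized(
    "/path/to/gptq-model",   # placeholder: directory holding the checkpoint
    model_basename="model",  # placeholder: checkpoint filename, no extension
    use_safetensors=True,    # set to match the checkpoint format
    device="cuda:0",
    quantize_config=quantize_config,
)
```

Equivalently, instead of passing quantize_config in code, you can place a quantize_config.json with the same keys ("bits", "group_size", "desc_act") next to the checkpoint, and AutoGPTQ will pick it up automatically.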
Thank you very much! The problem has been resolved.