How to do inference?
What should I use to do inference with this model? I tried GPTQ-for-LLaMa, but it showed checkpoint shape mismatch errors.
Please share the error log, the specs you are running on, any extensions like the text-generation web UI, etc. The more info, the better.
I just installed the GPTQ-for-LLaMa code from GitHub and tried llama_inference.py. I tested many LoRA and non-LoRA 4-bit models from HF, and it seems like only ozcur/alpaca-native-4bit works for me. The others give errors like: RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM: Missing key(s) in state_dict: ... Unexpected key(s) in state_dict: ... size mismatch for model.layers ...
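For reference, this is roughly the command I'm running (the model path and checkpoint name are placeholders; the flags follow the GPTQ-for-LLaMa README at the time):

python llama_inference.py ./llama-7b-hf --wbits 4 --load ./llama-7b-4bit.pt --text "this is llama"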
It must be a rename in one of the libs. I compared the missing/unexpected keys, e.g.:
missing: model.layers.0.self_attn.q_proj.qzeros
unexpected: model.layers.0.self_attn.q_proj.zeros
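If you want to inspect a checkpoint yourself, a minimal sketch like this works (assuming it's a plain PyTorch .pt state dict; the path is a placeholder):

```python
import torch

# Load just the state dict on CPU; "model-4bit.pt" is a placeholder path.
state_dict = torch.load("model-4bit.pt", map_location="cpu")

# Print the quantization keys of one projection to see whether the
# checkpoint uses the old "zeros" naming or the new "qzeros" one,
# and what shapes the tensors have.
for key in sorted(state_dict):
    if key.startswith("model.layers.0.self_attn.q_proj."):
        print(key, tuple(state_dict[key].shape))
```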
Yep, it seems like they renamed some things and broke the old checkpoints, as usually happens (and judging by the size-mismatch errors, the tensor shapes changed too, so just renaming the keys in an old checkpoint wouldn't be enough):
https://github.com/qwopqwop200/GPTQ-for-LLaMa/commit/a270974e732884126ddb36f64d0a0a25261bb94f
If anyone has the same problem, just downgrade to the older version of the lib:
git checkout 468c47c01b4fe370616747b6d69a2d3f48bab5e4
python setup_cuda.py install
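To verify the CUDA kernel was actually rebuilt from the old commit, try importing it (assuming the extension is still named quant_cuda, as in the repo's setup_cuda.py):

python -c "import quant_cuda"

If the import succeeds without errors, the extension built correctly.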
Seems to be working fine now :)
Now we cannot use this fix, since the webui has also updated. I get a parameter mismatch if I check out an older commit.
A new model has been uploaded; going to close this issue.
Any idea why text-generation-webui cannot find the config.json file? I have everything needed in the folder.
https://github.com/oobabooga/text-generation-webui/issues/613
Hope someone can figure it out
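As a quick sanity check, a sketch like this (the folder name and file list are assumptions based on what a typical HF + GPTQ model folder contains) will show whether the files are where the webui expects them, under models/&lt;name&gt;:

```python
from pathlib import Path

# Placeholder folder name; text-generation-webui loads from models/<name>.
model_dir = Path("models/llama-7b-4bit")
print("folder exists:", model_dir.is_dir())

# Files a typical HF + GPTQ 4-bit model folder should contain.
for name in ["config.json", "tokenizer_config.json", "tokenizer.model"]:
    print(name, "OK" if (model_dir / name).exists() else "MISSING")

# The 4-bit weights themselves are usually a .pt or .safetensors file.
weights = list(model_dir.glob("*.pt")) + list(model_dir.glob("*.safetensors"))
print("checkpoint files:", [p.name for p in weights])
```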
What about this?
https://github.com/oobabooga/text-generation-webui/issues/734#issuecomment-1496864694
This branch contains the changes necessary to run the upstream cuda branch of GPTQ-for-LLaMa: https://github.com/oobabooga/text-generation-webui/tree/new-qwop
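If you want to try it, a fresh clone plus a branch checkout should be enough (the branch name comes from the URL above):

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
git checkout new-qwop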