How to do inference?
What should I use to do inference with this model? I tried GPTQ-for-LLaMa, but it showed checkpoint shape mismatch errors.
Please share the error log, the specs you are running on, any extensions like the text-generation web UI, etc. The more info, the better.
I just installed the GPTQ-for-LLaMa code from GitHub and tried llama_inference.py. I tested many LoRA and non-LoRA 4-bit models from HF, and it seems like only ozcur/alpaca-native-4bit works for me. The others give errors like: RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM: Missing key(s) in state_dict: ... Unexpected key(s) in state_dict: ... size mismatch for model.layers ...
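For reference, this is roughly the command I'm running (the model path and checkpoint name are placeholders; the flags follow the GPTQ-for-LLaMa README at the time):

python llama_inference.py ./llama-7b-hf --wbits 4 --load ./llama-7b-4bit.pt --text "this is llama"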
It must be a rename in one of the libs. I compared the missing/unexpected keys, e.g.:
missing: model.layers.0.self_attn.q_proj.qzeros
unexpected: model.layers.0.self_attn.q_proj.zeros
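If you want to inspect a checkpoint yourself, a minimal sketch like this works (assuming it's a plain PyTorch .pt state dict; the path is a placeholder):

```python
import torch

# Load just the state dict on CPU; "model-4bit.pt" is a placeholder path.
state_dict = torch.load("model-4bit.pt", map_location="cpu")

# Print the quantization keys of one projection to see whether the
# checkpoint uses the old "zeros" naming or the new "qzeros" one,
# and what shapes the tensors have.
for key in sorted(state_dict):
    if key.startswith("model.layers.0.self_attn.q_proj."):
        print(key, tuple(state_dict[key].shape))
```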
Yep, it seems like they renamed some things and broke the old checkpoints, as usually happens (and judging by the size-mismatch errors, the tensor shapes changed too, so just renaming the keys in an old checkpoint wouldn't be enough):
https://github.com/qwopqwop200/GPTQ-for-LLaMa/commit/a270974e732884126ddb36f64d0a0a25261bb94f
If anyone has the same problem, just downgrade to the older version of the lib:
git checkout 468c47c01b4fe370616747b6d69a2d3f48bab5e4
python setup_cuda.py install
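To verify the CUDA kernel was actually rebuilt from the old commit, try importing it (assuming the extension is still named quant_cuda, as in the repo's setup_cuda.py):

python -c "import quant_cuda"

If the import succeeds without errors, the extension built correctly.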
Seems to be working fine now :)
Now we cannot use this fix, since the webui has also updated. I get a parameter mismatch if I check out an older commit.
A new model has been uploaded; going to close this issue.
Any idea why text-generation-webui cannot find the config.json file? I have everything needed in the folder.
https://github.com/oobabooga/text-generation-webui/issues/613
Hope someone can figure it out
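As a quick sanity check, a sketch like this (the folder name and file list are assumptions based on what a typical HF + GPTQ model folder contains) will show whether the files are where the webui expects them, under models/&lt;name&gt;:

```python
from pathlib import Path

# Placeholder folder name; text-generation-webui loads from models/<name>.
model_dir = Path("models/llama-7b-4bit")
print("folder exists:", model_dir.is_dir())

# Files a typical HF + GPTQ 4-bit model folder should contain.
for name in ["config.json", "tokenizer_config.json", "tokenizer.model"]:
    print(name, "OK" if (model_dir / name).exists() else "MISSING")

# The 4-bit weights themselves are usually a .pt or .safetensors file.
weights = list(model_dir.glob("*.pt")) + list(model_dir.glob("*.safetensors"))
print("checkpoint files:", [p.name for p in weights])
```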
What about this?
https://github.com/oobabooga/text-generation-webui/issues/734#issuecomment-1496864694
This branch contains the changes necessary to run the upstream cuda branch of GPTQ-for-LLaMa: https://github.com/oobabooga/text-generation-webui/tree/new-qwop
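If you want to try it, a fresh clone plus a branch checkout should be enough (the branch name comes from the URL above):

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
git checkout new-qwop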