Can I use this model in text-generation-webui?

#1
by CyberTimon - opened

I tried yesterday; it has its own inference mode and the model won't get recognized.
I couldn't make it work on Colab; maybe someone else can. Maybe tomorrow I can try a local install.
But it seems you need santacoder_inference to make it work. I don't think ooba supports it yet:

python -m santacoder_inference bigcode/starcoder --wbits 4 --load starcoder-GPTQ-4bit-128g/model.pt

Hey @CyberTimon, it should work.
I tried inference with santacoder using this command a few days ago and it was working.
I haven't tried starcoder, but I don't see a reason why it shouldn't work.

The webui won't work though.

I also cannot figure out how to make this thing work. Even running python -m santacoder_inference bigcode/starcoder --wbits 4 --load ../models/starcoder-GPTQ-4bit-128g/model.pt just tries to download the model files again.

I already downloaded starcoder-GPTQ-4bit-128g/model.pt.

Can anyone point me in the right direction? If I can figure it out, I'll write a guide on what I did.

Edit: I used GPT-4 to help me rearrange everything until I could run it. The main issue seems to be the config.json file; I keep getting the errors below (I didn't paste them all). GPT-4 says it's because my config.json is from the original starcoder and not from the GPTQ 4-bit 128g version. Still looking into how to solve this issue. Maybe I'm just an idiot? Will confirm soon.

for-SantaCoder$ python santacoder_inference.py bigcode/starcoder --wbits 4 --load /mnt/i/ai/text-generation-webui/models/starcoder-GPTQ-4bit-128g/model.pt
Traceback (most recent call last):
  File "/mnt/i/ai/text-generation-webui/repositories/GPTQ-for-SantaCoder/santacoder_inference.py", line 114, in <module>
    main()
  File "/mnt/i/ai/text-generation-webui/repositories/GPTQ-for-SantaCoder/santacoder_inference.py", line 104, in main
    model = get_santacoder(args.model, args.load, args.wbits)
  File "/mnt/i/ai/text-generation-webui/repositories/GPTQ-for-SantaCoder/santacoder_inference.py", line 58, in get_santacoder
    model.load_state_dict(state_dict_original)
  File "/home/kcramp/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2056, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for GPTBigCodeForCausalLM:
    Unexpected key(s) in state_dict: "transformer.h.0.attn.c_attn.zeros", "transformer.h.0.attn.c_proj.zeros", "transformer.h.0.mlp.c_fc.zeros", "transformer.h.0.mlp.c_proj.zeros", "transformer.h.1.attn.c_attn.zeros", "transformer.h.1.attn.c_proj.zeros", "transformer.h.1.mlp.c_fc.zeros", "transformer.h.1.mlp.c_proj.zeros", "transformer.h.2.attn.c_attn.zeros", "transformer.h.2.attn.c_proj.zeros", "transformer.h.2.mlp.c_fc.zeros", "transformer.h.2.mlp.c_proj.zeros", "transformer.h.3.attn.c_attn.zeros", "transformer.h.3.attn.c_proj.zeros", "transformer.h.3.mlp.c_fc.zeros", "transformer.h.3.mlp.c_proj.zeros", "transformer.h.4.attn.c_attn.zeros", "transformer.h.4.attn.c_proj.zeros", "transformer.h.4.mlp.c_fc.zeros", "transformer.h.4.mlp.c_proj.zeros", ...
    size mismatch for transformer.h.0.attn.c_attn.weight: copying a param with shape torch.Size([768, 6400]) from checkpoint, the shape in current model is torch.Size([6400, 6144]).
    size mismatch for transformer.h.0.attn.c_proj.weight: copying a param with shape torch.Size([768, 6144]) from checkpoint, the shape in current model is torch.Size([6144, 6144]).
    size mismatch for transformer.h.0.mlp.c_fc.weight: copying a param with shape torch.Size([768, 24576]) from checkpoint, the shape in current model is torch.Size([24576, 6144]).
    size mismatch for transformer.h.0.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 6144]) from checkpoint, the shape in current model is torch.Size([6144, 24576]).
    size mismatch for transformer.h.1.attn.c_attn.weight: copying a param with shape torch.Size([768, 6400]) from checkpoint, the shape in current model is torch.Size([6400, 6144]).
    size mismatch for transformer.h.1.attn.c_proj.weight: copying a param with shape torch.Size([768, 6144]) from checkpoint, the shape in current model is torch.Size([6144, 6144]).
    size mismatch for transformer.h.1.mlp.c_fc.weight: copying a param with shape torch.Size([768, 24576]) from checkpoint, the shape in current model is torch.Size([24576, 6144]).
    size mismatch for transformer.h.1.mlp.c_proj.weight: copying a param with shape torch.Size([3072, 6144]) from checkpoint, the shape in current model is torch.Size([6144, 24576]).

@kcramp858 Yeah, I think that will download the model files from the original repo again.
Let me outline how this works:
The model is loaded in fp16, and then we inject the int8/int4 weights into it.
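
In code, the flow is roughly the following. This is a minimal sketch rather than the actual repo code: make_quant is a hypothetical stand-in for the repo's layer-swapping helper, and the paths are just examples.

import torch
from transformers import AutoModelForCausalLM

# Step 1: load the base model in fp16. This is the step that pulls the
# original repo's weights from the Hub (or the local HF cache), even if
# the quantized .pt file is already on disk.
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder", torch_dtype=torch.float16
)

# Step 2: swap the fp16 Linear layers for quantized equivalents so the
# parameter names and shapes match the quantized checkpoint.
# make_quant is a hypothetical stand-in for the repo's helper.
make_quant(model, wbits=4, groupsize=128)

# Step 3: inject the packed int4 tensors (weights, scales, zeros) from
# the GPTQ checkpoint into the now-quantized layers. Skipping step 2 is
# what produces "Unexpected key(s)" and size-mismatch errors like the
# ones above.
model.load_state_dict(torch.load("starcoder-GPTQ-4bit-128g/model.pt"))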
Regarding the error you are seeing, I am unsure. I will investigate.

Thanks for trying it out :)

Hey guys, sorry.
I have fixed the bug.
Context: I was debugging something and had accidentally hardcoded groupsize to -1.
Can you try specifying --groupsize 128 for starcoder during inference? I just tried it and it worked for me :)
Please note that for santacoder, you should specify -1.
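
For reference, the full corrected command should look something like this (same layout as the command earlier in the thread):

python -m santacoder_inference bigcode/starcoder --wbits 4 --groupsize 128 --load starcoder-GPTQ-4bit-128g/model.pt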

Fixed in the latest commit: https://github.com/mayank31398/GPTQ-for-SantaCoder/commit/40df38b03e4ebdaf9e5a444e9f7b4b6df79cff39
Please pull the changes :)

Can we close this?

Sure, but it still doesn't work in oobabooga. I don't think you can easily change this, though, since it has to do with the architecture of the model.

CyberTimon changed discussion status to closed

@CyberTimon I've tried this model: https://huggingface.co/GeorgiaTechResearchInstitute/starcoder-gpteacher-code-instruct

It works with oobabooga but it's huge and slow.

Yeah, you will need to quantize that model yourself.
You can take a look at the scripts provided in my repo.
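The invocation is something like the following; the script name, calibration dataset argument, and output path here are rough guesses based on similar GPTQ repos, so check the README for the exact usage:

python santacoder.py bigcode/starcoder c4 --wbits 4 --groupsize 128 --save starcoder-4bit-128g.pt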

Would this model run on oobabooga if quantized? I've not done anything like this before. How long would it take, and would it be possible on a normal PC with a 3090 GPU?

I am not sure.
I am planning to add a quantized version of starchat this week too.
