text-generation-webui: AttributeError: 'Offload_LlamaModel' object has no attribute 'preload', when trying to generate text
Traceback (most recent call last):
  File "D:\oobabooga\text-generation-webui\modules\callbacks.py", line 66, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "D:\oobabooga\text-generation-webui\modules\text_generation.py", line 290, in generate_with_callback
    shared.model.generate(**kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
    return self.sample(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
    outputs = self(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\transformers\models\llama\modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\oobabooga\text-generation-webui\repositories\GPTQ-for-LLaMa\llama_inference_offload.py", line 135, in forward
    if idx <= (self.preload - 1):
  File "C:\Users\FuckMicrosoftPC\.conda\envs\textgen\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Offload_LlamaModel' object has no attribute 'preload'
Note: trying to load the GPTQ safetensors model
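For context, here is a minimal sketch of this failure mode using a hypothetical stand-in class (plain Python, not the real GPTQ-for-LLaMa or torch code): a mismatched loader never sets the `preload` attribute that the stale offload code still reads during the forward pass.

```python
# Minimal sketch of the version-mismatch failure (hypothetical stand-in,
# not the actual GPTQ-for-LLaMa code).

class Offload_LlamaModel:
    def __init__(self, preload=None):
        # A matching loader would set self.preload to the number of layers
        # kept on the GPU; a mismatched one never sets it at all.
        if preload is not None:
            self.preload = preload

    def forward(self, idx):
        # Stale inference code assumes the attribute always exists.
        if idx <= (self.preload - 1):  # AttributeError if preload was never set
            return "gpu"
        return "cpu"

try:
    Offload_LlamaModel().forward(0)
except AttributeError as err:
    print(err)  # 'Offload_LlamaModel' object has no attribute 'preload'
```

Deleting the stale repositories/GPTQ-for-LLaMa folder and re-cloning it, as described below, keeps both sides of that attribute contract in sync.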
Try the ooba fork of GPTQ. Do not use the new one.
Doesn't seem to do much, unless I'm mistaken and using the wrong version. Using https://github.com/oobabooga/GPTQ-for-LLaMa/
That is definitely the right one. I only have this set up on Linux, though.
Worked for me after I downloaded the config.json from the other folder.
What folder? Where? How?
https://huggingface.co/reeducator/vicuna-13b-free/tree/main/hf-output
It also has the FP16 weights, so you can quantize it however you want, e.g. act-order with no group size.
The issue for me was that I was using an outdated GPTQ-for-LLaMa repo. I checked the readme and it says to delete that folder before updating.
For anyone that needs instruction:
Windows: Simply delete the GPTQ-for-LLaMa folder (located at /text-generation-webui/repositories/), then run update_windows.bat if you used the Windows version.
Linux: Delete the same folder as on Windows, then replace it with the newest version from https://github.com/oobabooga/GPTQ-for-LLaMa.git
Clone the repo to your machine in the repositories folder with:
git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda
cd into the newly cloned directory, then run:
python -m pip install -r requirements.txt