Model does not reply ( Is typing.. ) / MetaIX/GPT4-X-Alpaca-30B-4bit

#14
by ilnurshams - opened

Could you please advise how I can fix the issue where the model does not reply to my message?
It just shows endless typing (see the attached screenshots). I've already reduced the token settings to the minimum, and I'm using chat mode.
Running on Windows 11, RTX 4090, 64 GB RAM.

(Attached screenshots: 2023-05-20 030013.png, 2023-05-20 030153.png)

I have the same problem here: "'LlamaForCausalLM' object has no attribute 'generate_with_streaming'" in the console. Linux, RTX 4090, 24 GB VRAM.
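
For what it's worth, generate_with_streaming isn't an attribute of the standard transformers LlamaForCausalLM class at all, so the error suggests the webui took a code path that expects its own model wrapper rather than the plain transformers model. A quick check you can run without downloading the model (assuming transformers is installed in the webui's environment):

    from transformers import LlamaForCausalLM

    # The stock transformers class only provides generate(); the streaming
    # variant is something the webui is expected to supply itself.
    print(hasattr(LlamaForCausalLM, "generate"))                 # True
    print(hasattr(LlamaForCausalLM, "generate_with_streaming"))  # False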

Okay, I fixed it. I don't know what the exact cause was. Here is what I did:

  1. Did a fresh install with the oobabooga one-click installer.
  2. Ran the start_windows file. Don't download any model yet!
  3. I manually downloaded the necessary model files from https://huggingface.co/MetaIX/GPT4-X-Alpaca-30B-4bit/tree/main
    (see the screenshot attached below) and put them in the oobabooga models folder.
  4. I installed PyTorch and CUDA via Conda (download and install Conda first, then run the Anaconda Prompt (miniconda3) as admin). There I ran the commands below to install PyTorch and CUDA. You also need Python 3.10, which I already had (download it if you don't).
    conda create --name gptq python=3.10 -y
    conda activate gptq
    conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

(I used the commands from this tutorial, which is written for Linux: https://github.com/qwopqwop200/GPTQ-for-LLaMa
I only used the PyTorch/CUDA install commands from it; a quick way to verify the install is shown below.)
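
To confirm the conda environment really got a CUDA-enabled PyTorch build, you can run this in Python (assuming the gptq environment created above is activated):

    import torch

    print(torch.__version__)              # PyTorch version installed by conda
    print(torch.version.cuda)             # should print "11.7" for this install
    print(torch.cuda.is_available())      # should be True
    print(torch.cuda.get_device_name(0))  # should mention the RTX 4090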

  5. Then I ran the update_windows file in the main oobabooga folder.
    Done!
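
For reference, launching the webui for a 4-bit GPTQ model at that time looked roughly like the command below. The folder name and flags are assumptions for the GPTQ-for-LLaMa loader of that era and depend on your setup and which weight file you downloaded (drop --groupsize 128 if you took the non-128g file):

    python server.py --chat --model MetaIX_GPT4-X-Alpaca-30B-4bit --wbits 4 --groupsize 128 --model_type llama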

(Attached screenshot of the model files: Screenshot 2023-05-25 181833.png)

ilnurshams changed discussion status to closed
