Running this model on Colab
Hi, I've been having some trouble testing this model on Google Colab (free tier). I get an error when I call AutoGPTQForCausalLM.from_quantized. Could anyone please help?
My code:
!pip install transformers accelerate einops sentencepiece
!git clone https://github.com/PanQiWei/AutoGPTQ
!pip install ./AutoGPTQ/
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM
model_name_or_path = "TheBloke/koala-13B-GPTQ-4bit-128g"
model_basename = "koala-13B-4bit-128g.safetensors"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    device="cuda:0",
    use_triton=False,
    use_safetensors=True,
    torch_dtype=torch.float32,
    trust_remote_code=False,
)
My error:
in <cell line: 3>:3

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/auto.py:85 in from_quantized

    82         model_type = check_and_get_model_type(save_dir or model_name_or_path, trust_remo
    83         quant_func = GPTQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized
    84         keywords = {key: kwargs[key] for key in signature(quant_func).parameters if key
❱   85         return quant_func(
    86             model_name_or_path=model_name_or_path,
    87             save_dir=save_dir,
    88             device_map=device_map,

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:666 in from_quantized

   663             raise TypeError(f"{config.model_type} isn't supported yet.")
   664
   665         if quantize_config is None:
❱  666             quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **k
   667
   668         if model_basename is None:
   669             if quantize_config.model_file_base_name:

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:90 in from_pretrained

    87                     _commit_hash=commit_hash,
    88             )
    89
❱   90         with open(resolved_config_file, "r", encoding="utf-8") as f:
    91             return cls(**json.load(f))
    92
    93     def to_dict(self):
TypeError: expected str, bytes or os.PathLike object, not NoneType
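Looking at the traceback, from_quantized falls through to BaseQuantizeConfig.from_pretrained and resolved_config_file ends up as None, so it seems no quantize_config.json could be resolved for this repo. As a rough sketch (untested; the bits/group_size/desc_act values are only guesses based on the repo name), building the quantize config manually and passing it in should at least skip that lookup:

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name_or_path = "TheBloke/koala-13B-GPTQ-4bit-128g"
# Depending on the auto_gptq version, the basename may need to be given
# without the ".safetensors" extension.
model_basename = "koala-13B-4bit-128g"

# Guessed from the repo name: 4-bit weights, group size 128; desc_act is an assumption.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    quantize_config=quantize_config,  # bypasses the quantize_config.json lookup
    device="cuda:0",
    use_triton=False,
    use_safetensors=True,
    trust_remote_code=False,
)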
Hey, so did you find a way to run GPTQ models on Colab?
If you are only looking to do inference, I was able to run some models on Colab using very simple code taken from other sources:
https://colab.research.google.com/drive/1rqLLYCoD4YlcSkzVp_b0NKpHxUBe3a2S?usp=sharing
I was able to run Falcon (very slowly) and Vicuna, but not Koala.
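The gist of it is just loading the quantized checkpoint and calling generate; roughly something like this (a sketch rather than the exact notebook code, with the Vicuna repo name only as an example):

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Example repo; substitute whichever GPTQ model you want to try.
# Older repos without a quantize_config.json may also need model_basename
# and an explicit quantize_config, as shown earlier in this thread.
model_id = "TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,
    use_triton=False,
)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))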
Thanks, but I figured out a way to run Llama-2 on Colab. It was actually pretty fast.
Nice! I haven't tried Llama-2 yet, but if you don't mind sharing your source, it would help me a lot.