Running this model on Colab
Hi, I've been having some trouble testing this model on Google Colab (free tier). I get an error when I call AutoGPTQForCausalLM.from_quantized. Could anyone please help?
My code:
!pip install transformers accelerate einops sentencepiece
!git clone https://github.com/PanQiWei/AutoGPTQ
!pip install ./AutoGPTQ/
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM
model_name_or_path = "TheBloke/koala-13B-GPTQ-4bit-128g"
model_basename = "koala-13B-4bit-128g.safetensors"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    device="cuda:0",
    use_triton=False,
    use_safetensors=True,
    torch_dtype=torch.float32,
    trust_remote_code=False,
)
My error:
in <cell line: 3>:3

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/auto.py:85 in from_quantized

    82         model_type = check_and_get_model_type(save_dir or model_name_or_path, trust_remo
    83         quant_func = GPTQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized
    84         keywords = {key: kwargs[key] for key in signature(quant_func).parameters if key
❱   85         return quant_func(
    86             model_name_or_path=model_name_or_path,
    87             save_dir=save_dir,
    88             device_map=device_map,

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:666 in from_quantized

   663             raise TypeError(f"{config.model_type} isn't supported yet.")
   664
   665         if quantize_config is None:
❱  666             quantize_config = BaseQuantizeConfig.from_pretrained(model_name_or_path, **k
   667
   668         if model_basename is None:
   669             if quantize_config.model_file_base_name:

/usr/local/lib/python3.10/dist-packages/auto_gptq/modeling/_base.py:90 in from_pretrained

    87                     _commit_hash=commit_hash,
    88             )
    89
❱   90         with open(resolved_config_file, "r", encoding="utf-8") as f:
    91             return cls(**json.load(f))
    92
    93     def to_dict(self):
TypeError: expected str, bytes or os.PathLike object, not NoneType
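Looking at the traceback, from_quantized falls through to BaseQuantizeConfig.from_pretrained and resolved_config_file ends up as None, so it seems no quantize_config.json could be resolved for this repo. As a rough sketch (untested; the bits/group_size/desc_act values are only guesses based on the repo name), building the quantize config manually and passing it in should at least skip that lookup:

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name_or_path = "TheBloke/koala-13B-GPTQ-4bit-128g"
# Depending on the auto_gptq version, the basename may need to be given
# without the ".safetensors" extension.
model_basename = "koala-13B-4bit-128g"

# Guessed from the repo name: 4-bit weights, group size 128; desc_act is an assumption.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    quantize_config=quantize_config,  # bypasses the quantize_config.json lookup
    device="cuda:0",
    use_triton=False,
    use_safetensors=True,
    trust_remote_code=False,
)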
Hey, so did you find a way to run GPTQ models on Colab?
If you are only looking to do inference, I was able to run some models on Colab using very simple code taken from other sources:
https://colab.research.google.com/drive/1rqLLYCoD4YlcSkzVp_b0NKpHxUBe3a2S?usp=sharing
I was able to run Falcon (very slowly) and Vicuna, but not Koala.
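The gist of it is just loading the quantized checkpoint and calling generate; roughly something like this (a sketch rather than the exact notebook code, with the Vicuna repo name only as an example):

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Example repo; substitute whichever GPTQ model you want to try.
# Older repos without a quantize_config.json may also need model_basename
# and an explicit quantize_config, as shown earlier in this thread.
model_id = "TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,
    use_triton=False,
)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))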
Thanks, but I figured out a way to run Llama-2 on Colab. It was actually pretty fast.
Nice! I haven't tried Llama-2 yet, but if you don't mind sharing your source, it would help me a lot.