Error while downloading model

#1
by pranshu3105 - opened

I'm getting the following error while downloading the model.

Code

import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    'openbmb/MiniCPM-o-2_6-int4',
    trust_remote_code=True,
    attn_implementation='sdpa', # sdpa or flash_attention_2
    cache_dir='model',
    torch_dtype=torch.bfloat16,
    init_vision=True,
    init_audio=True,
    init_tts=True
)

model = model.eval().cuda()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-o-2_6-int4', trust_remote_code=True)

Error

low_cpu_mem_usage was None, now set to True since model is quantized.
Traceback (most recent call last):
  File "/gridbkp/pranshu/minicpm_evaluation/inference_script.py", line 8, in <module>
    model = AutoModel.from_pretrained(
  File "/gridbkp/pranshu/miniconda3/envs/minicpm/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "/gridbkp/pranshu/miniconda3/envs/minicpm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3738, in from_pretrained
    if metadata.get("format") == "pt":

You need to use AutoGPTQ to load the model; I will update the README tomorrow.

'''
model = AutoGPTQForCausalMLM.from_quantized(path, torch_dtype=torch.bfloat16, device="cuda:0", trust_remote_code=True, disable_exllama=True)
'''

I changed the code a little, but now I'm getting another issue related to AutoGPTQ:

        self.minicpmo_model_path = "openbmb/MiniCPM-o-2_6-int4"
        self.model_version = "2.6"
        with torch.no_grad():
            # self.minicpmo_model = AutoModel.from_pretrained(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, attn_implementation='sdpa', low_cpu_mem_usage=True)
            self.minicpmo_model = AutoGPTQForCausalLM.from_quantized(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, device=self.device, disable_exllama=True)

error log:

Traceback (most recent call last):
  File "D:\workspace-ai\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 602, in <module>
    stream_manager = StreamManager()
  File "D:\workspace-ai\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 97, in __init__
    self.minicpmo_model = AutoGPTQForCausalLM.from_quantized(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, device=self.device, disable_exllama=True)
  File "C:\Users\lyon\anaconda3\envs\minicpm\lib\site-packages\auto_gptq\modeling\auto.py", line 114, in from_quantized
    model_type = check_and_get_model_type(model_name_or_path, trust_remote_code)
  File "C:\Users\lyon\anaconda3\envs\minicpm\lib\site-packages\auto_gptq\modeling\_utils.py", line 305, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: minicpmo isn't supported yet.

I did some research and found that minicpmo is actually not supported in the official AutoGPTQ repository. I also found this fork, which has changes to support minicpm but not minicpmo: https://github.com/AutoGPTQ/AutoGPTQ/compare/main...LDLINGLINGLING:AutoGPTQ:minicpm_autogptq
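
To make the error above concrete: AutoGPTQ keeps a whitelist of supported model_type strings and raises before it ever loads weights if the checkpoint's config is not in it. The sketch below is simplified and not the library's exact code (the real whitelist and names live inside auto_gptq), but it shows why a fork has to register "minicpmo" for from_quantized to proceed:

# Simplified sketch of auto_gptq's support check (illustrative only, not the library's exact code).
from transformers import AutoConfig

SUPPORTED_MODELS = ["llama", "mistral", "qwen2", "minicpm"]  # illustrative subset; "minicpmo" is absent upstream

def check_and_get_model_type(model_name_or_path, trust_remote_code=False):
    config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=trust_remote_code)
    if config.model_type not in SUPPORTED_MODELS:
        raise TypeError(f"{config.model_type} isn't supported yet.")
    return config.model_type

# MiniCPM-o-2_6-int4 ships a config with model_type "minicpmo",
# which is why the stock wheel fails with "minicpmo isn't supported yet."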

Looking forward to your updated README today :)

BTW: I installed auto-gptq with: pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

I can see AutoGPTQForCausalLM in the official AutoGPTQ repository, but there is no AutoGPTQForCausalMLM. Can you provide the repo link for it?
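
(A quick check of the installed wheel confirms this: auto_gptq only exports AutoGPTQForCausalLM, so AutoGPTQForCausalMLM in the snippet above looks like a typo.)

# Sanity check of the installed auto-gptq wheel.
from importlib.metadata import version
print(version("auto-gptq"))                  # e.g. 0.7.1
from auto_gptq import AutoGPTQForCausalLM    # exists; there is no AutoGPTQForCausalMLM in auto_gptq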

OpenBMB org

We will create a new autogptq repo to support int4 inference for MiniCPM-o 2.6 tomorrow.

Please use int4 inference based on our updated README.
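
The updated README itself is not quoted in this thread, so the following is only a sketch of what int4 loading through a MiniCPM-o-aware AutoGPTQ build would plausibly look like; the repo/branch to install from and the exact keyword arguments should be taken from the official README once it is published.

# Sketch only: assumes an AutoGPTQ build that registers the "minicpmo" model type
# (e.g. the dedicated repo the OpenBMB team mentions above) is already installed.
import torch
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

model_path = 'openbmb/MiniCPM-o-2_6-int4'

# from_quantized loads the pre-quantized int4 weights directly instead of quantizing on the fly.
model = AutoGPTQForCausalLM.from_quantized(
    model_path,
    torch_dtype=torch.bfloat16,
    device="cuda:0",
    trust_remote_code=True,
    disable_exllama=True,  # fall back to non-exllama kernels
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

Whether the init_vision/init_audio/init_tts flags from the original AutoModel call are still needed on this path is something the updated README will have to clarify.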
