Error while downloading model
I am getting the following error while downloading the model.
Code

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    'openbmb/MiniCPM-o-2_6-int4',
    trust_remote_code=True,
    attn_implementation='sdpa',  # sdpa or flash_attention_2
    cache_dir='model',
    torch_dtype=torch.bfloat16,
    init_vision=True,
    init_audio=True,
    init_tts=True
)
model = model.eval().cuda()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-o-2_6-int4', trust_remote_code=True)
```
Error

```
low_cpu_mem_usage was None, now set to True since model is quantized.
Traceback (most recent call last):
  File "/gridbkp/pranshu/minicpm_evaluation/inference_script.py", line 8, in <module>
    model = AutoModel.from_pretrained(
  File "/gridbkp/pranshu/miniconda3/envs/minicpm/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "/gridbkp/pranshu/miniconda3/envs/minicpm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3738, in from_pretrained
    if metadata.get("format") == "pt":
```
You need to use AutoGPTQ to load the model; I will update the README tomorrow.
```python
model = AutoGPTQForCausalMLM.from_quantized(path, torch_dtype=torch.bfloat16, device="cuda:0", trust_remote_code=True, disable_exllama=True)
```
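For anyone following along, here is a minimal self-contained sketch of that approach (assumptions: the class intended above is `auto_gptq`'s `AutoGPTQForCausalLM`, and `path` points at the int4 checkpoint; with an unpatched AutoGPTQ build this still hits the model-type error discussed below):

```python
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM  # assumption: this is the class meant above

path = 'openbmb/MiniCPM-o-2_6-int4'

# Load the GPTQ-quantized checkpoint directly via AutoGPTQ instead of
# transformers' AutoModel; the kwargs mirror the snippet above.
model = AutoGPTQForCausalLM.from_quantized(
    path,
    torch_dtype=torch.bfloat16,
    device="cuda:0",
    trust_remote_code=True,
    disable_exllama=True,
)
model = model.eval()

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
```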
I changed the code a little, and now I seem to be hitting another issue related to AutoGPTQ:
```python
self.minicpmo_model_path = "openbmb/MiniCPM-o-2_6-int4"
self.model_version = "2.6"

with torch.no_grad():
    # self.minicpmo_model = AutoModel.from_pretrained(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, attn_implementation='sdpa', low_cpu_mem_usage=True)
    self.minicpmo_model = AutoGPTQForCausalLM.from_quantized(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, device=self.device, disable_exllama=True)
```
Error log:

```
Traceback (most recent call last):
  File "D:\workspace-ai\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 602, in <module>
    stream_manager = StreamManager()
  File "D:\workspace-ai\MiniCPM-o\web_demos\minicpm-o_2.6\model_server.py", line 97, in __init__
    self.minicpmo_model = AutoGPTQForCausalLM.from_quantized(self.minicpmo_model_path, trust_remote_code=True, torch_dtype=self.target_dtype, device=self.device, disable_exllama=True)
  File "C:\Users\lyon\anaconda3\envs\minicpm\lib\site-packages\auto_gptq\modeling\auto.py", line 114, in from_quantized
    model_type = check_and_get_model_type(model_name_or_path, trust_remote_code)
  File "C:\Users\lyon\anaconda3\envs\minicpm\lib\site-packages\auto_gptq\modeling\_utils.py", line 305, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: minicpmo isn't supported yet.
```
I did some research and found that `minicpmo` is not supported in the official AutoGPTQ repository. I also found this fork, which carries changes to support `minicpm` but not `minicpmo`: https://github.com/AutoGPTQ/AutoGPTQ/compare/main...LDLINGLINGLING:AutoGPTQ:minicpm_autogptq
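This matches the traceback above: `from_quantized` dispatches on `config.model_type` against a table of supported model types and raises the `TypeError` when the type is unknown. A quick way to check what your installed build supports (this assumes the `SUPPORTED_MODELS` list lives in `auto_gptq.modeling._const`, as in current releases; treat the import path as an assumption):

```python
# Print the model types the installed auto_gptq build can dispatch to.
# If "minicpmo" is absent, from_quantized raises the TypeError shown above.
from auto_gptq.modeling._const import SUPPORTED_MODELS

print(sorted(SUPPORTED_MODELS))
print("minicpmo supported:", "minicpmo" in SUPPORTED_MODELS)
```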
Looking forward to your nice readme today :)
BTW: I installed auto-gptq from here: `pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/`
I can see `AutoGPTQForCausalLM` in the official AutoGPTQ repository, but there is no `AutoGPTQForCausalMLM`. Can you provide a repo link for it?
We will create a new autogptq repo to support int4 inference for MiniCPM-o 2.6 tomorrow.
Please use int4 inference based on our updated README.
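In the meantime, here is a rough, hypothetical sketch of what int4 inference could look like once a patched AutoGPTQ build registers `minicpmo` (the `chat` call follows the usage shown on the non-quantized MiniCPM-o model card; the exact signature for the int4 path may differ, so defer to the updated README):

```python
from PIL import Image

# Assumes `model` and `tokenizer` were loaded via from_quantized as sketched above.
image = Image.open('example.jpg').convert('RGB')  # hypothetical local image
msgs = [{'role': 'user', 'content': [image, 'Describe this image.']}]

# chat() is the remote-code interface documented on the MiniCPM-o model card
# for the full-precision model; the int4 path is assumed to expose the same API.
answer = model.chat(msgs=msgs, tokenizer=tokenizer)
print(answer)
```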