CUDA out of memory (4090 24G)

#44
by weiminw - opened

Hi, when I run the code you provided, it fails with: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU . How can I solve this?

import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('/workspace/models/vlm/MiniCPM-Llama3-V-2_5', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('/workspace/models/vlm/MiniCPM-Llama3-V-2_5', trust_remote_code=True)
model.to(device='cuda')  # weights are loaded in float32 by default
# note: tokenizers have no .to() method; only the model is moved to the GPU
model.eval()

image = Image.open('/workspace/models/341705737396_.pic_hd.jpg').convert('RGB')
question = '详细描述图片的内容'  # "Describe the content of the image in detail"
msgs = [{'role': 'user', 'content': question}]

res = model.chat(
    image=image,
    msgs=msgs,
    tokenizer=tokenizer,
    sampling=True, # if sampling=False, beam_search will be used by default
    temperature=0.1,
    # system_prompt='' # pass system_prompt if needed
)
print(res)

When I ran it, the console looked like it was still downloading the checkpoint. I don't know why:

Loading checkpoint shards: 100%|██████████████████| 7/7 [00:02<00:00,  2.36it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
