glm-4v-9b: CUDA out of memory when calling forward myself

#25
by FearandDreams - opened

@zRzRzRzRzRzRzR Hi author, everything works fine when I use the demo's outputs = self.basemodel.generate(**inputs, **gen_kwargs), but when I call output = self.base_model.forward(input_ids=tokens, images=image_tensor, return_dict=True) myself, I get CUDA out of memory. I want to extract the logits from the output; is there another way to do that? My GPU has 48 GB of memory.

Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University org

Do you have a complete test script? That would help me reproduce the issue.

@zRzRzRzRzRzRzR Hi, the test code is below (it mainly just calls forward):

import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer, BitsAndBytesConfig
from huggingface_hub import login
login(token="your token")

import os

os.environ["TRANSFORMERS_CACHE"] = "enter if needed"

device = "cuda:3" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4v-9b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4v-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True
).to(device)

# Build the image tensor through the chat template and tokenize a text prompt.
image = Image.open("test2.jpg").convert('RGB')
image_tensor = tokenizer.apply_chat_template(
    [{"role": "user", "image": image}],
    add_generation_prompt=True, tokenize=True, return_tensors="pt",
    return_dict=True
)["images"]
prompt = "hello"
prompt_tokens = tokenizer.encode(prompt)
prompt_tokens = torch.tensor([prompt_tokens], device=device)

# Copy of the image tensor that should eventually receive gradients.
image_tensor_to_update = image_tensor.clone().detach().requires_grad_(True)

with torch.cuda.amp.autocast():
    tokens = prompt_tokens.to(device)
    image_tensor = image_tensor.to(device)

    # Direct forward call (no generate, no no_grad) -- this is where the OOM happens.
    output = model.forward(
        input_ids=prompt_tokens,
        images=image_tensor,
        return_dict=True
    )

logits = output.logits

@zRzRzRzRzRzRzR Hi, I still haven't solved the OOM problem itself, but if I wrap the forward call in no_grad there is no OOM, so the problem is probably on the gradient side.
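
For anyone else hitting this, a minimal sketch of that no_grad workaround, reusing the variable names from the repro script above (not an official fix, just the pattern described in the previous comment): torch.no_grad() stops PyTorch from retaining the intermediate activations it would otherwise keep for a backward pass, which is what exhausts the 48 GB when forward is called directly.

# Disable autograd so no activation graph is kept during the forward pass.
with torch.no_grad():
    output = model(
        input_ids=prompt_tokens,
        images=image_tensor,
        return_dict=True
    )
logits = output.logits  # logits can still be read; they just carry no grad_fn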

@zRzRzRzRzRzRzR I also tried an A100 on my side and still get OOM.


Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University org

I have already gotten fine-tuning of this model working on my side; with all of 4v-9b's parameters active, 80 GB is definitely not enough.
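
For context, a rough back-of-the-envelope (an estimate, not a measurement) of why a single 80 GB card cannot hold full-parameter training: mixed-precision Adam needs on the order of 16 bytes per trainable parameter before any activations are counted.

# Assuming bf16 weights/grads plus fp32 master weights and Adam moments
# (~16 bytes per trainable parameter); activations excluded.
# The ~9B parameter count is approximate.
n_params = 9e9
bytes_per_param = 2 + 2 + 4 + 8   # bf16 weights + bf16 grads + fp32 master + Adam m,v
print(n_params * bytes_per_param / 1e9)   # ~144 GB, already past a single 80 GB card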

@zRzRzRzRzRzRzR Thanks for the explanation! But I don't need gradients for all parameters. If I only need gradients with respect to the input, how should I change the code?
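
Not a maintainer answer, just a sketch of one common pattern for input-only gradients, reusing the variable names from the repro above: freeze every model parameter so no parameter-gradient buffers are allocated, mark the image tensor as requiring grad, and call backward on a scalar reduction of the logits. Note that the full activation graph is still stored because backprop has to reach the input, so this alone may not fit in 48 GB; gradient checkpointing (if the model's remote code supports it) and shorter inputs may also be needed.

# Freeze the 9B weights: no parameter-gradient buffers will be allocated.
for p in model.parameters():
    p.requires_grad_(False)

# Leaf tensor on the right device that will receive the input gradient.
image_tensor = image_tensor.to(device).detach().requires_grad_(True)

output = model(
    input_ids=prompt_tokens,
    images=image_tensor,
    return_dict=True
)
# Reduce to a scalar before backward; the sum here is only a placeholder for
# whatever objective the gradient is actually needed for.
output.logits.sum().backward()
input_grad = image_tensor.grad   # gradient with respect to the image input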
