OOM Issue with 16GB GPU

#9
by ftopal - opened

Hey there,

I am trying to run this model on Kaggle using a P100 (16 GB), and it tries to allocate 64 GB of memory. I am confused, since the model size is 1.3 GB and the parameter count is less than 1B. Any pointers on why this is happening?


Alibaba-NLP org

Hi, this could be caused by either the input text being too long or the batch_size being too large. Perhaps you could try processing a much smaller amount of data at a time to see if it runs successfully.
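To make that concrete, here is a rough sketch of why activations can dwarf the weights. It assumes the attention implementation materializes the full score matrix (memory-efficient kernels do not), and the head count, sequence length, and batch size are made-up illustrative numbers, not values read from this model's config. The helper names `attention_scores_bytes` and `batched` are hypothetical.

```python
def attention_scores_bytes(batch_size, num_heads, seq_len, bytes_per_value=4):
    """Memory for one layer's attention score matrix.

    Shape is (batch_size, num_heads, seq_len, seq_len), so the cost grows
    quadratically with sequence length and linearly with batch size.
    bytes_per_value=4 assumes fp32; use 2 for fp16/bf16.
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_value


def batched(items, batch_size):
    """Yield consecutive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


GIB = 1024 ** 3

# Illustrative (assumed) numbers: 16 texts of 8192 tokens, 16 heads, fp32.
big = attention_scores_bytes(batch_size=16, num_heads=16, seq_len=8192)
# Same setup, but 2 texts truncated to 512 tokens.
small = attention_scores_bytes(batch_size=2, num_heads=16, seq_len=512)

print(f"large batch, long text:  {big / GIB:.2f} GiB")
print(f"small batch, short text: {small / GIB:.4f} GiB")

# Feeding the data in small chunks instead of all at once:
texts = ["first text", "second text", "third text"]
for chunk in batched(texts, batch_size=2):
    print(chunk)
```

So a single oversized batch of long inputs can ask for far more memory than the 1.3 GB of weights would suggest, and truncating inputs or lowering the batch size shrinks the request quickly.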

Did you make sure to run inference in eval mode and with gradient tracking turned off?

model.eval()
with torch.no_grad():
    ...  # run the model here
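For reference, a minimal self-contained sketch of that pattern. The `nn.Linear` layer is only a hypothetical stand-in for the real model, and the random tensor stands in for tokenized inputs.

```python
import torch
from torch import nn

# Hypothetical stand-in for the real model: a single linear layer.
model = nn.Linear(8, 4)
inputs = torch.randn(2, 8)

model.eval()  # switch layers such as dropout to inference behavior

with torch.no_grad():  # do not record operations for autograd
    outputs = model(inputs)

# No graph was recorded, so intermediate activations are not kept
# around for a backward pass.
print(outputs.requires_grad)
```

Without `torch.no_grad()`, PyTorch keeps intermediate activations alive for a possible backward pass, which adds to memory use during inference.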

izhx changed discussion status to closed
