OOM Issue with 16GB GPU

by ftopal - opened May 4

May 4

Hey there,

I am trying to run this model on Kaggle using a P100 (16 GB), and it tries to allocate 64 GB of memory. I am confused, since the model size is 1.3 GB and the parameter count is less than 1B. Any pointers on why this is happening?

izhx

Alibaba-NLP org May 6

Hi, this could be caused by either the input text being too long or the batch_size being too large. Try processing a much smaller amount of data at a time to see whether it runs successfully.
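To give a rough sense of why text length and batch size matter much more than the weight size, here is a back-of-the-envelope sketch. It assumes attention is computed without a memory-efficient kernel, so each layer materializes a score matrix of shape batch x heads x seq_len x seq_len. The head count, sequence lengths, and float32 precision below are illustrative assumptions, not values taken from this model's config, and `attention_scores_bytes` and `batched` are hypothetical helper names.

```python
def attention_scores_bytes(batch_size, num_heads, seq_len, bytes_per_value=4):
    """Bytes for one layer's attention score matrix (batch x heads x seq x seq)."""
    return batch_size * num_heads * seq_len * seq_len * bytes_per_value


def batched(items, batch_size):
    """Yield successive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumed settings for illustration only: 16 heads, float32 (4 bytes per value).
large = attention_scores_bytes(batch_size=16, num_heads=16, seq_len=8192)
small = attention_scores_bytes(batch_size=2, num_heads=16, seq_len=512)

print(f"batch=16, seq_len=8192: {large / GIB:.0f} GiB")  # 64 GiB
print(f"batch=2,  seq_len=512:  {small / MIB:.0f} MiB")  # 32 MiB

# Feeding the data in small slices keeps each forward pass small.
texts = ["a", "b", "c", "d", "e"]
for chunk in batched(texts, 2):
    print(chunk)
```

The cost grows with the square of the sequence length, so truncating long inputs usually helps more than anything else, with a smaller batch size as the second lever.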

free-variation

May 8

Did you make sure to run inference like this?

model.eval()
with torch.no_grad():
    ....
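For anyone who wants to check what this does, here is a minimal sketch. It uses a tiny `nn.Linear` as a hypothetical stand-in for the real model, so nothing has to be downloaded. Inside `torch.no_grad()` PyTorch does not record the autograd graph, so intermediate activations are not kept around for a backward pass.

```python
import torch
from torch import nn

# Hypothetical stand-in for the real model; any nn.Module behaves the same way.
model = nn.Linear(8, 4)
inputs = torch.randn(2, 8)

model.eval()              # switch layers such as dropout to inference behaviour
with torch.no_grad():     # do not track gradients during the forward pass
    outputs = model(inputs)

print(model.training)         # False
print(outputs.requires_grad)  # False
```

Note that `model.eval()` alone does not disable gradient tracking; it is the `torch.no_grad()` context that saves the memory.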

izhx changed discussion status to closed Aug 14
