HIP out of memory

#51
by Renqing - opened

Hello, when I ran the cell classification example notebook, I encountered the following error:

`HIP out of memory. Tried to allocate 674.00 MiB (GPU 0; 15.98 GiB total capacity; 14.77 GiB already allocated; 0 bytes free; 15.43 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.`

Do you have any suggestions for working around this, or any memory requirements I should meet?

The same error also occurs with the in silico perturbation notebook.

Thank you for your interest in Geneformer. Based on your error message, it looks like you are close to fitting your analysis within your resources. You can reduce the batch size to fit your resource constraints.
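One way to lower the per-step memory footprint without changing the effective batch size is gradient accumulation. A minimal sketch (the target effective batch of 12 is an arbitrary illustration, not a value from this thread):

```python
import math

def accumulation_steps(target_effective_batch, per_device_batch, num_gpus=1):
    """Smallest number of gradient-accumulation steps so that
    per_device_batch * num_gpus * steps >= target_effective_batch."""
    return math.ceil(target_effective_batch / (per_device_batch * num_gpus))

# Example: keep an effective batch of 12 while only fitting 4 samples
# per step on a 16 GiB GPU -> accumulate over 3 steps.
steps = accumulation_steps(target_effective_batch=12, per_device_batch=4)
```

With Hugging Face `TrainingArguments`, this corresponds to setting `per_device_train_batch_size=4` and `gradient_accumulation_steps=steps`.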

ctheodoris changed discussion status to closed

I used the cell_type_annotation data you provided for testing and set the batch size to 4, but the in silico perturbation example you provided hits the same error. How can I resolve it?

As discussed in the Methods of our manuscript, we used 32G GPUs for our training. Based on the message you posted, your GPUs are about half the size, so this is going to considerably reduce the batch size you can use. You can reduce the batch size to fit your resource requirements or, if available, increase your resources. If you have smaller GPUs but more than one, you can also consider distributing your training.
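If reserved memory stays well above allocated memory, the fragmentation hint from the error message can also be tried. A sketch (the 128 MiB value is an arbitrary starting point; `PYTORCH_HIP_ALLOC_CONF` applies to ROCm builds of PyTorch, while CUDA builds read `PYTORCH_CUDA_ALLOC_CONF`):

```python
import os

# Must be set before PyTorch first allocates GPU memory (e.g. at the
# top of the notebook). max_split_size_mb caps the size of cached
# blocks the allocator will split, which reduces fragmentation.
os.environ.setdefault("PYTORCH_HIP_ALLOC_CONF", "max_split_size_mb:128")
```

This does not lower total memory use, so combine it with a smaller batch size rather than relying on it alone.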
