Error in Fine-Tuning Flan-UL2 model

#26
by Ahatsham - opened

Hi, I am trying to fine-tune the Flan-UL2 model on my dataset. I have two 32-GB GPUs, but whenever I try to run my model, it shows:

"RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 31.74 GiB total capacity; 30.74 GiB already allocated; 136.88 MiB free; 30.74 GiB reserved in total by PyTorch)."

Can anybody help me with that?

Flan-UL2 requires a minimum of around 50 GB of GPU memory, and that is probably just for inference. For batched training you need a cluster. 64 GB might suffice for inference if you load the model across both GPUs; I think you are trying to load the whole model on device 0.

Yes, you are right. I am new to this area; can you help me with how to use multiple GPUs?

Adding the parameter device_map="auto" while loading the pipeline should do the trick: https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.pipeline.device_map
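
For reference, a minimal sketch of what that looks like (it assumes accelerate is installed and uses the google/flan-ul2 checkpoint; adjust the task and dtype to your setup):

```python
from transformers import pipeline

# device_map="auto" lets Accelerate shard the model across all visible GPUs
# (and offload the remainder to CPU if it still doesn't fit) instead of
# placing the whole model on GPU 0.
pipe = pipeline(
    "text2text-generation",
    model="google/flan-ul2",
    device_map="auto",
    torch_dtype="auto",  # load in the checkpoint's dtype instead of fp32 to save memory
)

print(pipe("Translate English to German: How old are you?", max_new_tokens=50))
```

Note that this only shards the weights for inference; as mentioned above, fine-tuning will still need more memory than two 32 GB GPUs provide.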