CUDA out of memory without Gradio
#4 by snakelemma - opened
I can run the model locally through Gradio, but not standalone. For Gradio I use the code from https://huggingface.co/spaces/qnguyen3/nanoLLaVA.
The standalone version (using the sample code) fails with "CUDA out of memory" on an NVIDIA GeForce RTX 4050 with 6 GB of VRAM, while the Gradio version does not fill the memory.
The error is thrown when SigLipAttention is loaded. Any idea why less VRAM is used with Gradio?
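One thing that may be worth comparing between the two scripts is the dtype the weights are loaded in, since that alone can double the memory needed. The sketch below is only a back-of-the-envelope estimate of weight memory per dtype. The parameter count is a hypothetical placeholder, not the real size of this model, and `weight_memory_gib` is a helper name made up for illustration.

```python
# Bytes needed to store one parameter in each common dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB.

    Ignores activations, the CUDA context, and allocator overhead,
    so real usage will be higher than this figure.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Assumption: placeholder parameter count, not the actual model size.
    hypothetical_params = 1_000_000_000
    for dtype in BYTES_PER_PARAM:
        gib = weight_memory_gib(hypothetical_params, dtype)
        print(f"{dtype:>8}: {gib:.2f} GiB")
```

To compare the two setups directly, peak usage can be read in each script with `torch.cuda.max_memory_allocated()` after the model has loaded.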
snakelemma changed discussion title from "CUDA out of memory without using Gradio" to "CUDA out of memory without Gradio"