RuntimeError: CUDA out of memory
Hello, and thank you for training this. I am having memory issues when trying to use this model. Is this not the "Medium" model of Whisper?
I can use the original medium model of Whisper normally, but when I try to use this fine-tuned model I run into VRAM issues. Was it not supposed to have the same consumption as the original?
By the way, I can use this model normally on my 3070 Ti (8 GB), where it consumes 8021 MB.
However, when I try to use it on my RTX 2060 laptop (6 GB) I get a 'CUDA out of memory' error, even though the original medium model works fine there.
Why is this error occurring? I look forward to your response, thank you.
Hi @mrwikrom 👋🏻
Thanks for using this model. I hope you are finding it helpful.
As you say, this is a fine-tuned model trained from the original Whisper Medium model. As such, its consumption should be the same as the original's.
I will investigate the issue you describe and get back to you with a solution as soon as possible. In the meantime, here is a link to this Notebook, which shows how to load Whisper models in 8-bit mode. This should reduce consumption and allow you to use the model on your RTX 2060 laptop.
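For reference, this is roughly the kind of loading the Notebook does. A minimal sketch, assuming transformers with the bitsandbytes package installed; the model ID is inferred from this repo, and exact arguments may vary with your transformers version:

```python
# Minimal sketch: 8-bit loading with transformers + bitsandbytes.
from transformers import WhisperProcessor, WhisperForConditionalGeneration

model_id = "jlondonobo/whisper-medium-pt"  # assumed ID for this repo

processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",   # place layers on the available GPU automatically
    load_in_8bit=True,   # quantize linear layers to int8 via bitsandbytes
)
```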
If you have trouble using the method described in the Notebook, please let me know and we can work it out together.
Talk to you soon,
José
Thank you for the response. This Colab does not seem to work with the original Whisper (the implementation that does not use transformers), or am I wrong?
Also, this medium model is indeed using more memory than the original medium.
Did you manage to figure out why, and whether there is any way to fix it?
Once again, thank you.
Hi @mrwikrom, were you able to solve this issue?
I would like to better understand your problem. Does it happen when you use OpenAI's original implementation, or when you use the transformers API?
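If it is the transformers API, one likely cause is that `from_pretrained` loads weights in float32 by default, while OpenAI's implementation runs in fp16 on the GPU, so memory use roughly doubles. A minimal sketch of half-precision loading, assuming the model ID for this repo and that these pipeline arguments match your transformers version:

```python
# Minimal sketch: half-precision loading with the transformers pipeline.
# transformers loads weights in float32 unless told otherwise, which
# roughly doubles VRAM use compared to fp16.
import torch
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="jlondonobo/whisper-medium-pt",  # assumed ID for this repo
    torch_dtype=torch.float16,             # halve weight memory vs fp32
    device=0,                              # first CUDA GPU
)

print(pipe("audio.mp3")["text"])
```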
Hello @jlondonobo,
I have noticed that you have attained very good performance fine-tuning Whisper. I am very interested in this and want to make some attempts at fine-tuning myself. However, I am encountering a "CUDA out of memory" problem. Could you tell me your GPU and the compute you needed for the whisper-medium model? Thanks!
Hello @Annie-Li, when using the whisper-medium-pt model, I noticed that it consumes approximately 7.8 GB of VRAM. If you are experiencing VRAM issues like me, I suggest using Faster Whisper (https://github.com/guillaumekln/faster-whisper) in conjunction with this model: https://huggingface.co/jlondonobo/whisper-large-v2-pt-v3.
By doing so, VRAM consumption is reduced to around 4.6 GB instead of 10 GB, and inference speed is improved. Although there might be a slight decrease in accuracy, the results will still be of high quality!
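For reference, a minimal sketch of that setup. Faster Whisper runs CTranslate2 models, so the Hugging Face checkpoint may first need converting with `ct2-transformers-converter`; the output directory name here is illustrative:

```python
# Minimal sketch: Faster Whisper inference with int8 quantization.
# The converted model directory is assumed to come from something like:
#   ct2-transformers-converter --model jlondonobo/whisper-large-v2-pt-v3 \
#       --output_dir whisper-large-v2-pt-ct2 --quantization float16
from faster_whisper import WhisperModel

model = WhisperModel(
    "whisper-large-v2-pt-ct2",    # converted model directory (illustrative)
    device="cuda",
    compute_type="int8_float16",  # int8 weights + fp16 compute: ~4.6 GB VRAM
)

segments, info = model.transcribe("audio.mp3", language="pt")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```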
Hello @jlondonobo, thanks for your answer! It is very useful to me, and I will make a new attempt with Faster Whisper. Thanks!