Multiple GPUs for inference error
same error
same error
I'm very sorry, but this problem has troubled me for a long time.
Different devices or different numbers of GPUs always trigger this issue in various ways.
I have a silly workaround, which involves modifying the source code of utils.py in the transformers library, manually moving the tensors to the same device.
If there is a better method, please let me know!
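This isn't the actual patch, just a minimal sketch of the idea behind it: before an op that mixes tensors living on different GPUs, move the operands onto one device. The helper name and tensors below are hypothetical.

```python
import torch

def to_same_device(*tensors, device=None):
    """Move every tensor onto one device (the first tensor's, unless given)."""
    device = device or tensors[0].device
    return tuple(t.to(device) for t in tensors)

# Example: align two tensors that ended up on different GPUs before combining them.
a = torch.randn(2, 4, device="cuda:0")
b = torch.randn(2, 4, device="cuda:1")
a, b = to_same_device(a, b)
out = a + b  # both operands now live on cuda:0, so the op no longer fails
```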
Could you share this please:
"I have a silly workaround, which involves modifying the source code of utils.py in the transformers library, manually moving the tensors to the same device."
@czczup
As a silly person myself, I would also be interested to know about your changes in utils.py. Could you maybe post a diff? Thanks.
Please refer to the new readme code. By placing the input and output layers of the LLM on a single device, it should now work without needing to modify utils.py, and this issue should no longer occur.
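For reference, here is a minimal sketch of that kind of device map, assuming a vision-language model with submodules named `vision_model`, `mlp1`, and `language_model`; the checkpoint name and layer count are placeholders, so please use the exact code from the README.

```python
import torch
from transformers import AutoModel

path = "OpenGVLab/InternVL-Chat-V1-5"   # placeholder checkpoint name
num_layers = 48                          # assumed; read it from the model config
num_gpus = torch.cuda.device_count()

# Pin the vision encoder and the LLM's embedding / final norm / lm_head to GPU 0
# so hidden states and logits never straddle devices; spread the decoder layers
# over the available GPUs.
device_map = {
    "vision_model": 0,
    "mlp1": 0,
    "language_model.model.embed_tokens": 0,
    "language_model.model.norm": 0,
    "language_model.lm_head": 0,
}
for i in range(num_layers):
    device_map[f"language_model.model.layers.{i}"] = i % num_gpus

model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map=device_map,
).eval()
```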
Hey bud! Just saw that you updated the readme! Wow it works! Thanks a ton man!! You rock!
It works, thanks!