I tried to load the model (4.0bpw) on a single A100 80GB GPU, but it failed with an OOM error. How can I split the model across multiple GPUs using ExLlamaV2?
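For context, here is a rough sketch of what I think the loading code should look like, based on my reading of the ExLlamaV2 examples (the model path is a placeholder, and the per-GPU gigabyte numbers are guesses on my part) — is `gpu_split` / `load_autosplit` the right approach?

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config

config = ExLlamaV2Config()
config.model_dir = "/path/to/model-4.0bpw-exl2"  # placeholder path
config.prepare()

model = ExLlamaV2(config)

# Option A: manual split — list of VRAM budgets (in GB) per visible GPU?
model.load(gpu_split=[40, 40])

# Option B: automatic split — create a lazy cache, then let the loader
# spread layers across all visible GPUs?
# cache = ExLlamaV2Cache(model, lazy=True)
# model.load_autosplit(cache)
```

I'm not sure whether the `gpu_split` values should leave headroom for the cache and activations, or how they interact with `CUDA_VISIBLE_DEVICES`.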
Thank you so much.