
How to run with multiple GPUs.

#4
by mocherson - opened

The example in the Quickstart loads the model onto the CPU for inference. I know model.cuda() can move the model to a GPU for small models. How can I distribute the model across multiple GPUs and run generation for large models? Thanks.

This Hugging Face guide on running inference with large models using Accelerate may help with distributing the model across devices: https://huggingface.co/docs/accelerate/usage_guides/big_modeling
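A minimal sketch of that approach, assuming the `accelerate` package is installed (`pip install accelerate`): passing `device_map="auto"` to `from_pretrained` asks Accelerate to split the model's layers across all visible GPUs (spilling to CPU if needed), so no manual `.cuda()` call is required.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# device_map="auto" shards the model's layers across available GPUs.
model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")

# Put inputs on the device of the first shard; Accelerate's hooks move
# activations between devices during the forward pass automatically.
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

You can inspect how the layers were placed via `model.hf_device_map` after loading.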

rskuzma changed discussion status to closed
