Inference

#99
by davidhung - opened

Inference is slow. How can I run inference on multiple GPUs?

Would also like to know this

@davidhung There are several ways to do that. Please share your code so people can help more easily.

No offense, but if there are "several ways," why not just suggest one instead of being unnecessarily difficult?

None taken mate :) One option is to use https://github.com/huggingface/text-generation-inference.
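Besides TGI, a common lighter-weight option (not mentioned in the thread, so treat this as one possible approach) is to let `transformers` with `accelerate` installed shard the model across all visible GPUs via `device_map="auto"`. The checkpoint name below is just a placeholder; substitute your own model.

```python
# Minimal sketch: shard a causal LM across available GPUs with
# device_map="auto" (requires the `accelerate` package).
# "gpt2" is only an example checkpoint, not the model from the thread.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; replace with your model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # splits layers across available GPUs
    torch_dtype=torch.float16,  # half precision cuts memory and speeds up inference
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that `device_map="auto"` gives model parallelism (layers spread over GPUs), which mainly helps models too large for one card; for throughput on a model that fits on a single GPU, running one replica per GPU or using a dedicated server like TGI is usually faster.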
