Running on a dedicated endpoint with 4x A100 (320 GB) still fails with "not enough hardware capacity"

#11
by trungnx26 - opened
This comment has been hidden
trungnx26 changed discussion status to closed

I'm having the same issue. Were you able to fix it?

trungnx26 changed discussion status to open

I believe this is a Hugging Face issue with us-east-1. After many tries, it worked well with just an A10.

And I'm trying with my 32 GB of RAM and a beloved 1060 6 GB.

Hi @trungnx26, may I ask which container type you used? Default or Text Generation Inference?
Also, can you tell us your specific endpoint settings? (AWS or GCP? Which Region?)

I have tried deploying in many regions, but it did not work. Thanks!

just Default for Text Generation Inference
