Correct machine to deploy the model on AWS Sagemaker
I am trying to deploy the model in an endpoint inside AWS Sagemaker.
I have tried several instance types, from "ml.g5.4xlarge" with 4 GPUs, which should be the standard way of deploying a 13B model like this one, up to the bigger "ml.g5.48xlarge" with 8 GPUs, and I always get an OOM error on one of the GPUs. Is there something I can try to make it work?
Do you have a configuration that is working on your side?
Having the same issue. I tried ml.g5.12xlarge with 4x 24GB GPUs, which should definitely be enough, but had no success.
I solved it on ml.g5.12xlarge
You can follow this tutorial
https://dgallitelli95.medium.com/using-aya-101-in-amazon-sagemaker-4c1f30dfa5cd
Notice the version in get_huggingface_llm_image_uri("huggingface", version="1.1.0").
I deployed with version="2.0.2" as I usually do and it did not work. It does work with version 1.1.0.
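For anyone else hitting this, a minimal deployment sketch along the lines of the tutorial, pinning the container to version 1.1.0 as described above. The model ID, timeout, and TGI env values here are assumptions for illustration; only the instance type (ml.g5.12xlarge) and image version (1.1.0) come from this thread:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # or an explicit IAM role ARN

# Pin the TGI container to 1.1.0 -- 2.0.2 produced OOM errors in this thread
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.1.0")

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        "HF_MODEL_ID": "<your-model-id>",  # placeholder: the Hub model to serve
        "SM_NUM_GPUS": "4",                # shard across the 4 GPUs of ml.g5.12xlarge
        "MAX_INPUT_LENGTH": "1024",        # assumed limits; tune for your use case
        "MAX_TOTAL_TOKENS": "2048",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    # large models can take a while to download and shard on startup
    container_startup_health_check_timeout=600,
)

print(predictor.endpoint_name)
```

Setting SM_NUM_GPUS to the instance's GPU count is what makes TGI shard the weights across all four GPUs instead of trying to fit them on one, which is the usual cause of the single-GPU OOM described above.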
Thanks! I will try it out!