Unable to deploy the model on SageMaker

#40
by AB-THE-AI - opened

Hi,

I used the deployment code snippet provided in Deploy on AWS SDK, but I'm getting this error:
UnexpectedStatusException: Error hosting endpoint tei-2025-01-10-12-54-16-347: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.. Try changing the instance type or reference the troubleshooting page https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference-troubleshooting.html

And in CloudWatch I'm getting:
Caused by:
Could not start backend: Model is not supported: unknown variant qwen2, expected one of bert, xlm-roberta, camembert, roberta, distilbert, nomic_bert at line 19 column 22

Below is the code:

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'dunzhang/stella_en_1.5B_v5'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface-tei", version="1.2.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)

# send request
predictor.predict({
    "inputs": "My name is Clara and I am",
})
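
For context on the CloudWatch message above: it reads as if the container fails while parsing the model's config.json, because the model_type value (qwen2) is not among the architectures this image version accepts. Below is a minimal sketch of that check. It assumes the supported list is exactly what the log prints, and it uses a hypothetical, trimmed config.json string rather than the real file from the Hub; the helper name check_model_type is made up for illustration.

```python
import json

# Architectures copied from the "expected one of ..." part of the CloudWatch log.
SUPPORTED_MODEL_TYPES = {
    "bert",
    "xlm-roberta",
    "camembert",
    "roberta",
    "distilbert",
    "nomic_bert",
}


def check_model_type(config_text):
    """Return (model_type, is_supported) for the text of a config.json file."""
    config = json.loads(config_text)
    model_type = config.get("model_type")
    return model_type, model_type in SUPPORTED_MODEL_TYPES


# Hypothetical, trimmed config.json; only the model_type field matters here.
sample_config = '{"model_type": "qwen2"}'

model_type, supported = check_model_type(sample_config)
print(f"model_type={model_type!r} supported={supported}")
```

If this reading is right, the ping health check fails because the backend never starts, so changing the instance type alone would probably not help; the container image would need to support the qwen2 architecture.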
