Unable to deploy the model on SageMaker
Hi,
I used the deployment code snippet provided in Deploy on AWS SDK, but I'm getting this error:
UnexpectedStatusException: Error hosting endpoint tei-2025-01-10-12-54-16-347: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.. Try changing the instance type or reference the troubleshooting page https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference-troubleshooting.html
And in CloudWatch I'm getting:
Caused by:
    Could not start backend: Model is not supported: unknown variant qwen2, expected one of bert, xlm-roberta, camembert, roberta, distilbert, nomic_bert at line 19 column 22
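To make the log message concrete, here is a minimal sketch of the check it appears to describe. It assumes the error refers to the `model_type` field of the model's config.json, which is an assumption on my part. The list of supported types is copied from the log above, not from the TEI documentation, and the helper name `check_model_type` and the trimmed-down sample config are hypothetical.

```python
# Model types listed in the CloudWatch error above (copied from the log,
# not taken from the TEI documentation).
SUPPORTED_MODEL_TYPES = {
    "bert",
    "xlm-roberta",
    "camembert",
    "roberta",
    "distilbert",
    "nomic_bert",
}


def check_model_type(config: dict) -> bool:
    """Return True if the config's model_type is in the list from the log."""
    return config.get("model_type") in SUPPORTED_MODEL_TYPES


# Hypothetical, trimmed-down config for illustration only.
sample_config = {"model_type": "qwen2"}
print(check_model_type(sample_config))  # prints False
```

Running the same check against a model's real config.json before deploying would show whether the container version in use is likely to reject it.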
Below is the code:
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']
# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'dunzhang/stella_en_1.5B_v5'
}
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface-tei", version="1.2.3"),
    env=hub,
    role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)
# send request
predictor.predict({
    "inputs": "My name is Clara and I am",
})