Need help invoking an endpoint from SageMaker

#11
by Faiz4work

I deployed a model from Hugging Face to SageMaker via S3, and the deployment succeeded. Now I want to know how to send it questions, i.e. how to run inference against the endpoint.

My code, abbreviated:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# role comes from the notebook's execution context
role = sagemaker.get_execution_role()

# Hub config: only the task is set; the weights are loaded from S3
hub2 = {
    'HF_TASK': 'text-generation',
}

model_path = "s3://penchatbotmodel/model.tar.gz"

huggingface_model2 = HuggingFaceModel(
    role=role,
    env=hub2,
    py_version='py36',
    transformers_version='4.6.1',
    pytorch_version='1.7.1',
    model_data=model_path,
)

predictor = huggingface_model2.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="ChatBotPoint2",
)
prompt="""<|prompter|>How can i stay more active during winter? Give me 3 tips.<|endoftext|><|assistant|>"""

Hyperparameters for the LLM:

payload = {
    "inputs": prompt,
    "messages": [{"role": "user", "content": "10.3 − 7988.8133 = "}],
    "parameters": {
        "do_sample": True,
        "top_p": 0.7,
        "temperature": 0.7,
        "top_k": 50,
        "max_new_tokens": 256,
        # "repetition_penalty": 1.03,
        # "stop": ["<|endoftext|>"]
    }
}
predictor.predict(payload)
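
For completeness, an already-deployed endpoint can also be invoked by name from any boto3 client, without re-creating the predictor object. A minimal sketch, assuming the endpoint name "ChatBotPoint2" from the deploy call above and the same payload:

import json
import boto3

# The sagemaker-runtime client wraps the InvokeEndpoint API
runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="ChatBotPoint2",
    ContentType="application/json",
    Body=json.dumps(payload),
)

# Body is a streaming object; read and decode it to get the JSON result
result = json.loads(response["Body"].read().decode("utf-8"))
print(result)

The SageMaker SDK equivalent is sagemaker.huggingface.HuggingFacePredictor(endpoint_name="ChatBotPoint2"), which handles the JSON (de)serialization for you.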

The error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
    "code": 400,
    "type": "InternalServerException",
    "message": "'mistral'"
}"
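
Note: 'mistral' is the text of a Python KeyError raised inside the container. transformers 4.6.1, which the deploy code above pins, predates the Mistral architecture, so the inference toolkit cannot find model_type "mistral" in its registry and returns a 400 before any generation runs. Redeploying on a container whose transformers version knows Mistral should clear the error. A minimal sketch using the Hugging Face LLM (TGI) container; the image version and env values here are assumptions to adapt to what your sagemaker SDK and region offer:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# TGI image recent enough to know the Mistral architecture
# (the version string is an assumption; check what your SDK lists)
llm_image = get_huggingface_llm_image_uri("huggingface", version="1.1.0")

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    model_data="s3://penchatbotmodel/model.tar.gz",
    env={
        "SM_NUM_GPUS": "1",          # ml.g5.2xlarge has a single GPU
        "MAX_INPUT_LENGTH": "2048",  # assumption: tune to the model's context
        "MAX_TOTAL_TOKENS": "4096",  # assumption: tune to the model's context
    },
)

predictor = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="ChatBotPoint2",
)

The TGI container expects a payload of the form {"inputs": ..., "parameters": {...}}, so the extra "messages" key in the payload above should be dropped.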
