Need help invoking an endpoint from SageMaker
I deployed a model from Hugging Face to SageMaker via S3, and the deployment succeeded. Now I want to know how to ask it questions, i.e. how to run inference against the endpoint.
My code, in short:
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Environment variables for the Hugging Face inference container
hub2 = {
    'HF_TASK': 'text-generation',
}

# Model artifact uploaded to S3
model_path = "s3://penchatbotmodel/model.tar.gz"

huggingface_model2 = HuggingFaceModel(
    role=role,
    env=hub2,
    py_version='py36',
    transformers_version='4.6.1',
    pytorch_version='1.7.1',
    model_data=model_path,
)

predictor = huggingface_model2.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="ChatBotPoint2",
)
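(If the notebook kernel restarts, I believe the already-deployed endpoint can be re-attached by name instead of redeploying; a minimal sketch, assuming "ChatBotPoint2" is still in service:)

from sagemaker.huggingface import HuggingFacePredictor

# Re-attach to the existing endpoint by name (assumes it is still InService);
# the default JSON serializer/deserializer handle dict payloads like the one below.
predictor = HuggingFacePredictor(endpoint_name="ChatBotPoint2")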
prompt = """<|prompter|>How can I stay more active during winter? Give me 3 tips.<|endoftext|><|assistant|>"""

# Hyperparameters for the LLM
payload = {
    "inputs": prompt,
    "messages": [{"role": "user", "content": "10.3 β 7988.8133 = "}],
    "parameters": {
        "do_sample": True,
        "top_p": 0.7,
        "temperature": 0.7,
        "top_k": 50,
        "max_new_tokens": 256,
        # "repetition_penalty": 1.03,
        # "stop": ["<|endoftext|>"]
    }
}

predictor.predict(payload)
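For reference, a minimal sketch of what I believe the equivalent direct call through the SageMaker runtime API would look like (the error below comes from the predictor.predict call above):

import json
import boto3

# Invoke the deployed endpoint directly by name via the runtime client
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="ChatBotPoint2",
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))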
The Error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027mistral\u0027"
}"