Not working in TGI

#3
by angeligareta - opened

Has anyone managed to get this model working by deploying it to SageMaker as a Hugging Face model? I have not been able to deploy it; any help on which configuration to use would be welcome. I have tried
config = { "HF_MODEL_ID": "01-ai/Yi-34B-Chat-4bits" }
and
config = { "HF_MODEL_ID": "01-ai/Yi-34B-Chat-4bits", 'QUANTIZE': 'awq' }

This turned out to be an error on the SageMaker side. A workaround is to build your own Docker image with TGI:

# Start from the official TGI release image
FROM ghcr.io/huggingface/text-generation-inference:1.1.0

# Wrap the launcher in a SageMaker-compatible entrypoint script
COPY sagemaker-entrypoint.sh entrypoint.sh
RUN chmod +x entrypoint.sh

ENTRYPOINT ["./entrypoint.sh"]

Then build the image, push it to Amazon ECR, and pass the resulting image URI to HuggingFaceModel.
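
The custom image URI is just the standard ECR path of the image you pushed; the account ID, region, and repository name below are placeholders:

account_id = "123456789012"  # placeholder AWS account ID
region = "us-east-1"         # placeholder region
custom_image_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/tgi-yi-34b-chat:1.1.0"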

huggingface_model = HuggingFaceModel(
    image_uri=custom_image_uri,  # the TGI image pushed to ECR above
    env=hub,                     # same env dict with HF_MODEL_ID, QUANTIZE, etc.
    role=role,
)
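
To round it off, deploying and invoking the endpoint would then look roughly like this (a sketch only; the instance type and startup timeout are assumptions, not values from this thread):

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # assumption: multi-GPU instance with room for the 4-bit 34B weights
    container_startup_health_check_timeout=600,  # allow time to download and load the model
)

response = predictor.predict({
    "inputs": "What is the capital of France?",
    "parameters": {"max_new_tokens": 64},
})
print(response)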
angeligareta changed discussion status to closed
