Errors when deploying to AWS SageMaker
I got an error when deploying this model to AWS SageMaker:
"No safetensors weights found for model bigcode/starcoder at revision None. Converting PyTorch weights to safetensors."
It seems SageMaker expects a single weights file, "model.pth" or "pytorch_model.bin",
but this repo has many sharded bin files like "pytorch_model-00003-of-00007.bin", etc.
I don't think I can simply concatenate those bin files.
Has anyone else encountered this issue?
I'm facing this too and don't know how to solve it.
I got past this error.
SageMaker will actually do the conversion for you, but you need to give it more time:
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.8xlarge",
    container_startup_health_check_timeout=1200,
)
Set container_startup_health_check_timeout to a larger value and the deployment will get past this error.
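For context, huggingface_model in the snippet above was created roughly like this. This is only a sketch of a standard Text Generation Inference (TGI) deployment; the role, image URI, and env values are my assumptions, not something specific to this repo:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes you are running with a SageMaker execution role

# TGI LLM container; this is the container that converts the .bin shards to safetensors on startup
llm_image = get_huggingface_llm_image_uri("huggingface")

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "bigcode/starcoder",  # model repo to pull from the Hub
        "SM_NUM_GPUS": "1",                  # number of GPUs to shard across
    },
)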
But then I hit the next error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 288.00 MiB (GPU 0; 22.20 GiB total capacity; 19.72 GiB already allocated; 143.12 MiB free;
21.11 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I upgraded to a bigger instance type and played with the PYTORCH_CUDA_ALLOC_CONF parameter, but the error persisted.
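In case it helps anyone, this is roughly how I passed the allocator setting, using the max_split_size_mb knob the OOM message mentions (same role/image names as in my earlier snippet; it did not fix things for me):

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "bigcode/starcoder",
        "SM_NUM_GPUS": "1",
        # allocator tuning suggested by the OOM message; it reduces fragmentation,
        # but it cannot help if the model simply does not fit on a single GPU
        "PYTORCH_CUDA_ALLOC_CONF": "max_split_size_mb:512",
    },
)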
Let me know if you see the same error.
It worked after switching to the ml.g4dn.12xlarge instance type and setting SM_NUM_GPUS: "4".
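Roughly, with the same HuggingFaceModel setup as earlier in the thread, that means something like:

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "bigcode/starcoder",
        "SM_NUM_GPUS": "4",  # shard the model across the 4 GPUs of ml.g4dn.12xlarge
    },
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.12xlarge",
    container_startup_health_check_timeout=1200,  # still needed for the safetensors conversion
)

Sharding across the 4 GPUs is what gets around the OOM, since the fp16 weights are too large for a single GPU's memory on that instance.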
Yes, I got it working with these configs. Thank you so much~