Pix2Struct SageMaker deployment failing because of task incompatibility
I am using "from sagemaker.huggingface import HuggingFaceModel" and deploying with the following task configuration:
hub = {
'HF_MODEL_ID':'google/pix2struct-docvqa-base',
'HF_TASK': 'visual-question-answering'
}
The deployment succeeds, but the resulting endpoint is unusable for inference. Every request fails with the error <"message": "A header text must be provided for VQA models.">, caused by an incompatibility between "/pix2struct/image_processing_pix2struct.py" and "/pipelines/visual_question_answering.py":
File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1109, in __call__
return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/visual_question_answering.py", line 117, in preprocess
image_features = self.image_processor(images=image, return_tensors=self.framework)
File "/opt/conda/lib/python3.10/site-packages/transformers/image_processing_utils.py", line 458, in __call__
return self.preprocess(images, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/models/pix2struct/image_processing_pix2struct.py", line 390, in
ValueError: A header text must be provided for VQA models.
This happens because the Pix2Struct-specific image processor requires both the image and header_text as inputs, whereas the standard VQA pipeline passes only the image to the image processor.
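The mismatch can be seen by calling the processor directly, outside the pipeline. The following is a minimal sketch, assuming `Pix2StructProcessor` and `Pix2StructForConditionalGeneration` from transformers 4.28 or later, plus `Pillow` and `requests`. The names `build_processor_kwargs` and `answer_question` are hypothetical helpers, not part of any library. As far as I can tell, passing the question as `text` is what lets the processor hand it to the image processor as the header text, which is the step the generic VQA pipeline skips.

```python
MODEL_ID = "google/pix2struct-docvqa-base"


def build_processor_kwargs(image, question):
    """Arguments for the Pix2Struct processor: the question travels with the image."""
    if not question:
        raise ValueError("Pix2Struct VQA checkpoints need a non-empty question")
    return {"images": image, "text": question, "return_tensors": "pt"}


def answer_question(image_url, question):
    # Heavy imports are kept local so the helper above stays importable on its own.
    import requests
    from PIL import Image
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    processor = Pix2StructProcessor.from_pretrained(MODEL_ID)
    model = Pix2StructForConditionalGeneration.from_pretrained(MODEL_ID)

    # Download the image and hand it to the processor together with the question.
    image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
    inputs = processor(**build_processor_kwargs(image, question))

    generated = model.generate(**inputs)
    return processor.decode(generated[0], skip_special_tokens=True)
```

If this runs locally without the ValueError, the model itself is fine and the problem is confined to how the pipeline invokes the image processor.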
I am using:
transformers_version='4.28.1',
pytorch_version='2.0.0',
py_version='py310',
and, after deployment, calling the predictor with:
predictor.predict({
"image": "https://9to5mac.com/wp-content/uploads/sites/6/2019/04/Screen-Shot-2019-04-18-at-11.29.01-AM.png?resize=1024,746",
"question": text
})
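One possible workaround is to bypass the generic pipeline with a custom inference script. The sketch below assumes that the SageMaker Hugging Face Inference Toolkit picks up `model_fn` and `predict_fn` overrides from a `code/inference.py` file, that the container can reach the Hugging Face Hub, and that the request body carries the same `image` (URL) and `question` keys as the call above. `parse_request` is a hypothetical helper, and none of this has been verified against a live endpoint.

```python
# code/inference.py (assumed location inside the model archive)
MODEL_ID = "google/pix2struct-docvqa-base"


def model_fn(model_dir):
    # Assumption: load from the Hub by ID instead of from model_dir.
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    processor = Pix2StructProcessor.from_pretrained(MODEL_ID)
    model = Pix2StructForConditionalGeneration.from_pretrained(MODEL_ID)
    return model, processor


def parse_request(data):
    """Validate the payload and return (image_url, question)."""
    missing = [key for key in ("image", "question") if not data.get(key)]
    if missing:
        raise ValueError(f"Missing required field(s): {', '.join(missing)}")
    return data["image"], data["question"]


def predict_fn(data, model_and_processor):
    import requests
    from PIL import Image

    model, processor = model_and_processor
    image_url, question = parse_request(data)

    # Pass the question as text so it reaches the image processor as header text.
    image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
    inputs = processor(images=image, text=question, return_tensors="pt")

    generated = model.generate(**inputs)
    return {"answer": processor.decode(generated[0], skip_special_tokens=True)}
```

With a handler like this, the 'HF_TASK' entry would presumably no longer drive inference, since the pipeline is not constructed at all.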