Inference Issue
I am a student learning to use Transformers on Hugging Face.
I am facing an issue creating a Hosted Inference API: the pipeline gives an "Unidentified feature_extractor" error while being built. To fix this issue I manually made changes in
preprocessor_config.json
as it contained "image_processor_type": "ViTImageProcessor".
I cross-checked with your file, which shows "feature_extractor": "ViTFeatureExtractor". Another issue: if I manually change the file and the pipeline builds, then calling the pipeline with
"captioner("image.jpg")"
throws an error saying "preprocess_fn() got an unexpected keyword argument 'images'".
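For reference, a sketch of how the pipeline is built and called (the model id is a placeholder for my own checkpoint):

from transformers import pipeline

# Placeholder model id; substitute the actual checkpoint.
captioner = pipeline("image-to-text", model="my-user/vit-gpt2-captioning")
print(captioner("image.jpg"))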
I am quite new to PyTorch and Hugging Face; it would be a great help if you could help me with this issue.
Thank you
- I have checked this and I am also getting the same error. It is because Hugging Face is replacing the feature extractor with an image processor; see https://github.com/huggingface/transformers/blob/7032e0203262ebb2ebf55da8d2e01f873973e835/src/transformers/models/vit/feature_extraction_vit.py#L29
I suspect there is an error in the package, which is why we are getting this.
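You can see the rename directly (a small sketch; in recent versions the old class is only a deprecated subclass of the new one):

from transformers import ViTFeatureExtractor, ViTImageProcessor

fe = ViTFeatureExtractor()                 # emits a FutureWarning pointing to ViTImageProcessor
print(isinstance(fe, ViTImageProcessor))   # True: the old name is kept as an alias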
For now, the solution is to change the entry to "feature_extractor_type": "ViTFeatureExtractor", as you already did, and the model is able to load.
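That is, the relevant entry in preprocessor_config.json would look like this (a sketch showing only the changed key; the other keys stay as they are):

{
  "feature_extractor_type": "ViTFeatureExtractor"
}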
- For the second issue, I am not getting any error when running inference with the pipeline. It feels like a package version problem again.
Can I load a previous version of the package and solve this issue?
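Pinning an older release may work as a stopgap; as far as I know, image processors were introduced in transformers v4.26.0, so (an untested sketch) anything earlier still uses the feature-extractor path:

pip install "transformers<4.26"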
Yes, that's right; it is a very small set.
You need to train on a larger set.
Thank you so much!
I had this issue for a week and today I got it solved!
Thanks again!
To train on a larger set, you can use a torch data iterator, for example:
import torch
from PIL import Image
from transformers import Seq2SeqTrainer, default_data_collator

# `ds`, `tokenizer`, `feature_extractor`, `model`, `training_args`, and
# `compute_metrics` are assumed to be defined earlier.

class ImageCaptioningDataset(torch.utils.data.Dataset):
    def __init__(self, ds, ds_type, max_target_length):
        self.ds = ds
        self.max_target_length = max_target_length
        self.ds_type = ds_type

    def __getitem__(self, idx):
        image_path = self.ds[self.ds_type]['image_path'][idx]
        caption = self.ds[self.ds_type]['caption'][idx]
        model_inputs = dict()
        model_inputs['labels'] = self.tokenization_fn(caption, self.max_target_length)
        model_inputs['pixel_values'] = self.feature_extraction_fn(image_path)
        return model_inputs

    def __len__(self):
        return len(self.ds[self.ds_type])

    # text preprocessing step
    def tokenization_fn(self, caption, max_target_length):
        """Tokenize the caption, padding and truncating to a fixed length."""
        labels = tokenizer(caption,
                           padding="max_length",
                           truncation=True,
                           max_length=max_target_length).input_ids
        return labels

    # image preprocessing step
    def feature_extraction_fn(self, image_path):
        """Run feature extraction on the image; `Image.open()` raises if the
        file cannot be read."""
        image = Image.open(image_path)
        encoder_inputs = feature_extractor(images=image, return_tensors="np")
        return encoder_inputs.pixel_values[0]

train_ds = ImageCaptioningDataset(ds, 'train', 64)
eval_ds = ImageCaptioningDataset(ds, 'validation', 64)

# instantiate trainer
trainer = Seq2SeqTrainer(
    model=model,
    tokenizer=feature_extractor,  # passed so it is saved with checkpoints
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=default_data_collator,
)
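With the trainer instantiated, fine-tuning and saving follow the standard Trainer API (the checkpoint goes to training_args.output_dir):

trainer.train()
trainer.save_model()  # also saves the feature extractor passed as `tokenizer`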