Multiple Question Input
Hi, nielsr previously mentioned that it was possible to do multiple question inputs by "sending a batch of images + questions through the model... provide a batch of pixel_values + decoder_input_ids to the generate method, and use the batch_decode method of the tokenizer to turn the generated ID's into text."
Does anyone have an example of this or a similar notebook that details more about how to do this? Thank you.
Hi,
I created a notebook to illustrate that: https://colab.research.google.com/drive/1oOgGwT-I51rTcA9f2CTB3I7nX1FxEsQj?usp=sharing.
Currently you're sending the same prompt (decoder_input_ids) twice through the model. For VQA, the prompt needs to be different for each example. I'll check this tomorrow.
I see, sounds good! If there is any way to input multiple questions for a single image, that would be awesome as well.
I've updated the notebook to reflect this. To ask multiple questions about a single image, you can just duplicate the image in the batch, once per question, rather than using different images.
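For readers without access to the notebook, here is a minimal sketch of the batching idea only: one image repeated once per question, with a different prompt for each copy. It is not the notebook's code. The helper `build_vqa_batch`, the `PROMPT_TEMPLATE` string, and the file name are hypothetical placeholders; the real prompt format depends on the model and checkpoint you are using.

```python
from typing import Any, List, Tuple

# Hypothetical prompt template (an assumption for illustration);
# replace it with the format your checkpoint was trained on.
PROMPT_TEMPLATE = "<question>{question}</question><answer>"


def build_vqa_batch(image: Any, questions: List[str]) -> Tuple[List[Any], List[str]]:
    """Pair a single image with several questions.

    Returns two lists of equal length: the image repeated once per
    question, and one prompt string per question.
    """
    images = [image] * len(questions)
    prompts = [PROMPT_TEMPLATE.format(question=q) for q in questions]
    return images, prompts


# Placeholder value; in practice this would be a loaded image object.
image = "invoice.png"
questions = ["What is the total?", "What is the invoice date?"]

images, prompts = build_vqa_batch(image, questions)
for img, prompt in zip(images, prompts):
    print(img, "->", prompt)
```

The two lists would then be turned into a batch of pixel_values and a batch of decoder_input_ids, passed to the generate method, and decoded with batch_decode, as described in the quote above. Prompts of different lengths need padding when tokenized, so check how the notebook handles that for your model.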
Gotcha, thank you so much for your help!