RAG Pipeline Creation

#1
by Vasanth - opened

Is it possible to build a RAG pipeline over a corpus of documents with this model, as an alternative to the ColPali series of models?

TIGER-Lab org

@Vasanth
Thanks for your interest in our work!
VLM2Vec is a general multimodal representation model, and some of its training data resembles the text-image retrieval task. In addition, our evaluation data includes subsets adapted from existing document retrieval datasets, such as https://huggingface.co/datasets/Tevatron/wiki-ss-nq. We therefore believe VLM2Vec could be used in a RAG pipeline. We also plan to test this direction further in the future (for example, by comparing with ColPali on document retrieval tasks).
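
For illustration only, here is a minimal sketch of the retrieval step such a pipeline would need. It assumes the query and each document page have already been embedded into vectors of the same dimension (for example by VLM2Vec; the embedding call itself is not shown and is not part of this sketch). The names `cosine`, `retrieve`, `page_vecs`, and the toy vectors are hypothetical and chosen for this example, not taken from the VLM2Vec codebase.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, page_vecs, k=2):
    """Return the ids of the k pages most similar to the query."""
    scored = [(cosine(query_vec, vec), page_id) for page_id, vec in page_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [page_id for _, page_id in scored[:k]]

# Toy embeddings standing in for real model outputs (assumption).
page_vecs = {
    "page_1": [1.0, 0.0, 0.0],
    "page_2": [0.0, 1.0, 0.0],
    "page_3": [0.7, 0.7, 0.0],
}
query_vec = [0.9, 0.1, 0.0]

top_pages = retrieve(query_vec, page_vecs, k=2)
print(top_pages)
```

In a full RAG pipeline the retrieved pages would then be passed to a generator model as context. For a large corpus, the linear scan above would typically be replaced by an approximate nearest-neighbor index.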

wenhu changed discussion status to closed