Performance on MTEB

#4
by JustJaro - opened

Hi,

I was wondering whether using this model for both text and image embeddings would degrade text performance; from the benchmarks it's not quite clear how it stands on MTEB.

Could you shed some light on this? Is it better or worse than, for example, intfloat/e5-mistral-7b-instruct?

Thanks for your help.

Cheers,
Jaro

TIGER-Lab org

@JustJaro

This is a great question. We haven't tested it yet, but it is part of our plan.
I expect that the results on MTEB may not be as strong as the current state-of-the-art text embedding models, since we haven't trained on any text-only data. One of our key next steps is to combine our current image pairwise data with text pairwise data and train a joint model. Based on insights from other work (such as E5-V), we believe that incorporating more text pairwise data could also benefit image-related tasks.
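For readers unfamiliar with "pairwise" training data, the objective typically used over such pairs is an in-batch contrastive (InfoNCE) loss: each query embedding is pulled toward its matching target and pushed away from the other targets in the batch. The sketch below is a minimal, dependency-free illustration of that loss with made-up toy vectors; it is not the authors' actual training code, and the function name and temperature value are assumptions for illustration:

```python
import math

def infonce_loss(queries, targets, temperature=0.05):
    """In-batch InfoNCE: each query's positive is the same-index
    target; all other targets in the batch act as negatives."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cosine(a, b):
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

    losses = []
    for i, q in enumerate(queries):
        # Temperature-scaled cosine similarities to every target.
        logits = [cosine(q, t) / temperature for t in targets]
        # Numerically stable log-sum-exp for softmax cross-entropy
        # against the matching index i.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)

# Toy batch mixing, say, a text-text pair and a text-image pair —
# both kinds of pairs flow through the same loss, which is what
# makes combining the two data sources straightforward.
queries = [[1.0, 0.0], [0.0, 1.0]]
targets = [[0.9, 0.1], [0.1, 0.9]]
print(infonce_loss(queries, targets, temperature=0.5))
```

Because text-text and text-image pairs share this single objective, mixing them only changes the composition of the training batches, not the loss itself.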

Thanks for your answer. Sounds like a great plan - and cheers for the great work on the embedding model!

ziyjiang changed discussion status to closed
