mSimCSE

#6
by KnutJaegersberg
Massive Text Embedding Benchmark org

Yeah, would be cool to know! Feel free to benchmark them; we can automatically add the scores to the leaderboard once we have the result files.
Maybe @yaushian is also interested in benchmarking :)

:)
I just discovered them whilst scouting; I don't yet know how to run the benchmarks.

https://github.com/YJiangcm/PromCSE by @YuxinJiang
It is suggested to be a new English SOTA model, so it should be interesting to compare, too.

Massive Text Embedding Benchmark org

Nice find! Yeah would be great to have them!
Here's a simple script for running: https://github.com/embeddings-benchmark/mteb/blob/main/scripts/run_mteb_english.py & instructions for adding to the LB are here: https://github.com/embeddings-benchmark/mteb#leaderboard
For SimCSE-like models, probably something like the wrapper here needs to be used: https://github.com/embeddings-benchmark/mtebscripts/blob/9f82086299d939900d1bedfe6c5551efae2145ce/run_array_simcse.py#L109
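If it helps, here's a rough sketch of what such a wrapper can look like (untested; CLS pooling is an assumption, just what SimCSE-style models typically use). MTEB only needs an object with an encode method that maps a list of sentences to embeddings:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

class SimCSEWrapper:
    """Minimal MTEB-compatible wrapper: MTEB only calls `encode`."""

    def __init__(self, model_name, device="cpu"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name).to(device).eval()
        self.device = device

    def encode(self, sentences, batch_size=32, **kwargs):
        """Return one embedding per sentence as a numpy array."""
        embeddings = []
        for i in range(0, len(sentences), batch_size):
            batch = self.tokenizer(
                sentences[i : i + batch_size],
                padding=True, truncation=True, max_length=128,
                return_tensors="pt",
            ).to(self.device)
            with torch.no_grad():
                out = self.model(**batch)
            # SimCSE-style models typically use the [CLS] token embedding
            embeddings.append(out.last_hidden_state[:, 0].cpu().numpy())
        return np.concatenate(embeddings)
```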

Running it for mSimCSE-mono now. I'm mostly interested in the multilingual tests; some tasks, such as AmazonCounterfactualClassification, come in several languages. So far I only got English results. Does it skip multilingual tasks?

I peeked at the classification results on the English data. It's mixed: sometimes it is the best option among the multilingual models, other times not, but never bad. It's not necessarily SOTA in the English classification category (multilingual embeddings subset), but it's very competitive. And not far from it, I'd guess, if the English results reflect what one can expect in multilingual settings, which is the key point of their paper: mSimCSE-mono with XLM-RoBERTa was trained on English data only, yet it achieves great results in other languages, usually outperforming the cross-lingually trained version.

How can I run only the multilingual classification tests?
The comprehensive benchmarks (e.g. clustering) take a long time to run; I'm not sure I can let them finish today.
Can I submit partial results, too? Or maybe I could run a set of tasks each weekend over several weeks.

Massive Text Embedding Benchmark org

You can specify languages as follows: `evaluation = MTEB(tasks=["AmazonCounterfactualClassification"], task_langs=["en"])`. Here it will only run English. If you leave task_langs empty, it will run all languages by default. You can check the available languages, e.g. here for AmazonCounterfactualClassification.
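To run only the multilingual classification tasks, you can also filter by task type. A small sketch (the language codes are just examples, and SimCSEWrapper refers to the wrapper sketched above):

```python
from mteb import MTEB

# Wrapper sketched earlier in the thread; any object with an
# `encode` method works.
model = SimCSEWrapper("yaushian/mSimCSE")

# Restrict the run to classification tasks in a few non-English
# languages; omitting task_langs runs every available language.
evaluation = MTEB(task_types=["Classification"], task_langs=["de", "ja", "zh"])
evaluation.run(model, output_folder="results/mSimCSE")
```

The result files land in output_folder, which is what we need for the leaderboard.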

Yes, you can submit partial results; many other models currently in the benchmark have only partial results!
Just add them to the metadata of the model you ran. You probably need to open a PR on https://huggingface.co/yaushian/mSimCSE/discussions and then @yaushian needs to merge it. If @yaushian is not available to merge it, you can also just copy the model to your account and add the metadata there.
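For reference, the leaderboard picks the scores up from model-index entries in the model card metadata, roughly like this (a sketch; the value is a placeholder, not a real score):

```yaml
tags:
- mteb
model-index:
- name: mSimCSE
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
    metrics:
    - type: accuracy
      value: 0.0  # placeholder, replace with your actual score
```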
