Is it possible to update the benchmark for `jina-embeddings-v2` using the `-zh` model?

#4
by nan - opened

[Image: assets_rag_eval_multiple_domains_summary.jpg]

We noticed that this benchmark uses the dataset at https://huggingface.co/datasets/maidalun1020/CrosslingualMultiDomainsDataset, which contains both Chinese and English text. However, the jina-embeddings-v2-en model is a monolingual model dedicated to English, so it is not designed to handle the Chinese portion of the data. It would be great if you could update the benchmark to use the bilingual model at https://huggingface.co/jinaai/jina-embeddings-v2-base-zh instead.

For reference:
English model: https://huggingface.co/jinaai/jina-embeddings-v2-base-en
English-Chinese model: https://huggingface.co/jinaai/jina-embeddings-v2-base-zh
English-German model: https://huggingface.co/jinaai/jina-embeddings-v2-base-de

Thank you for your suggestion! We will remove "jina-embeddings-v2-en" from this leaderboard and look forward to the bilingual model "jina-embeddings-v2-base-zh".

maidalun1020 changed discussion status to closed
