How does the model behave on sentences that contain both English and Hebrew words?

by ShaniHadar - opened Jul 31, 2024

Jul 31, 2024 • edited Jul 31, 2024

I tested the model on Hebrew sentences and the similarity results were great. Will I also be able to use the vectors to calculate similarity if the sentences are composed of both English and Hebrew words?

imvladikon

Owner Jul 31, 2024

Hi Shani,
Thank you!

Unfortunately, I don't think it would work well in a cross-lingual setup, because I used AlephBERT as the backbone model and fine-tuned it on a pairwise Hebrew sentence similarity downstream task. It's an interesting idea, though, and I might consider producing such a model in the future. For now, I would suggest checking out cross-lingual models such as LaBSE or e5-multilingual for that purpose.
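In case it helps, here is a minimal sketch of how you could compare mixed Hebrew/English sentences with one of those models. It is only an illustration under some assumptions: it assumes the `sentence-transformers` package is installed and that LaBSE is loaded from the Hub id `sentence-transformers/LaBSE`. The helper names (`cosine_similarity`, `pairwise_similarity`, `load_labse_encoder`) are made up for this example and are not part of any library.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
# An encoder maps a list of sentences to one vector per sentence.
Encoder = Callable[[List[str]], Sequence[Vector]]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return float(dot / (norm_a * norm_b))


def pairwise_similarity(encode: Encoder, left: str, right: str) -> float:
    """Embed both sentences with the given encoder and compare them."""
    vec_left, vec_right = encode([left, right])
    return cosine_similarity(vec_left, vec_right)


def load_labse_encoder() -> Encoder:
    """Hypothetical helper: wraps LaBSE as an Encoder.

    Assumes `sentence-transformers` is installed; the model is downloaded
    on first use. Imported lazily so the rest of the sketch has no
    third-party dependency.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/LaBSE")
    return lambda sentences: model.encode(sentences)
```

You would then call something like `pairwise_similarity(load_labse_encoder(), "אני אוהב machine learning", "I love למידת מכונה")`. I have not benchmarked any of these models on code-mixed text, so it is worth checking the scores on a few of your own sentence pairs first.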
