How does the model behave on sentences that contain both English and Hebrew words?

#1
by ShaniHadar - opened

I tested the model with Hebrew sentences and the similarity results were great. I was wondering: will I be able to use the vectors to calculate similarity if the sentences are composed of both English and Hebrew words?

Hi Shani,
Thank you!

Unfortunately, I don't think it would work well in a cross-lingual setup, because I used AlephBert as the backbone model and fine-tuned it on a pairwise Hebrew sentence similarity downstream task. It's an interesting idea, though, and I might consider producing such a model in the future. For now, I would suggest checking out cross-lingual models such as LaBSE or e5-multilingual for that purpose.
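
If it helps, here is a minimal sketch of how you could try a cross-lingual model on mixed Hebrew/English sentences. It assumes the `sentence-transformers` package is installed and that LaBSE is loaded under the Hub ID `sentence-transformers/LaBSE`; the helper names and the example sentences are made up for illustration, and I have not benchmarked this on mixed-language data, so please treat the scores as something to validate on your own examples.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (plain Python lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def mixed_language_similarity(sentence_a, sentence_b,
                              model_name="sentence-transformers/LaBSE"):
    """Embed two sentences with a cross-lingual model and compare them.

    `model_name` is an assumption: swap in whichever multilingual
    checkpoint you want to evaluate.
    """
    # Imported lazily so the helper above works without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    emb_a, emb_b = model.encode([sentence_a, sentence_b])
    return cosine_similarity(emb_a.tolist(), emb_b.tolist())


if __name__ == "__main__":
    # Hypothetical mixed Hebrew/English inputs.
    score = mixed_language_similarity(
        "אני אוהב ללמוד machine learning",
        "I enjoy studying למידת מכונה",
    )
    print(f"similarity: {score:.3f}")
```

Running the same sentence pairs through both this model and a cross-lingual one should show fairly quickly which handles your mixed-language data better.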
