Questions about model & architecture

by tomaarsen HF staff - opened Jun 24, 2024

Hello!

I just stumbled upon this when looking at recent Sentence Transformer models, and I think it's quite interesting to see the custom architecture (although I haven't yet figured out what's new about it compared to e.g. RoBERTa). Would you like to share some information about it?

I also wanted to let you know that Sentence Transformers recently had a big v3.0 update, which refactored the training. Old training scripts should mostly still work, but training can now also be done with a SentenceTransformerTrainer that resembles the transformers Trainer, in case you're familiar with that one. Notably, it's now much easier to track the performance of your model during training, via Weights & Biases/TensorBoard integrations and better callbacks. I think it might be quite useful for you. The updated training documentation can be found here: https://sbert.net/docs/sentence_transformer/training_overview.html
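For readers who haven't seen the v3.0 training flow yet, here is a minimal sketch of what it can look like. It is not the setup used for the model in this thread: the toy sentence pairs, the choice of MultipleNegativesRankingLoss, the output directory and the function name `train_sketch` are all assumptions made for illustration. The imports sit inside the function so that defining it doesn't require the libraries to be installed.

```python
def train_sketch(base_model: str, output_dir: str = "output/bilingual-sketch"):
    """Fine-tune `base_model` on a toy (anchor, positive) dataset with the v3 trainer."""
    # Imported lazily: only calling the function needs these packages.
    from datasets import Dataset
    from sentence_transformers import (
        SentenceTransformer,
        SentenceTransformerTrainer,
        SentenceTransformerTrainingArguments,
    )
    from sentence_transformers.losses import MultipleNegativesRankingLoss

    model = SentenceTransformer(base_model)

    # Toy English/French pairs, invented purely for illustration.
    train_dataset = Dataset.from_dict({
        "anchor": ["A man is eating.", "Un chien court dans le parc."],
        "positive": ["Someone is having a meal.", "Un animal se déplace dehors."],
    })

    # Assumed loss: treats the other positives in the batch as negatives.
    loss = MultipleNegativesRankingLoss(model)

    args = SentenceTransformerTrainingArguments(
        output_dir=output_dir,
        num_train_epochs=1,
        per_device_train_batch_size=2,
        logging_steps=1,
        report_to="none",  # switch to "wandb" or "tensorboard" to track training
    )

    trainer = SentenceTransformerTrainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        loss=loss,
    )
    trainer.train()
    return model
```

Calling something like `train_sketch("xlm-roberta-base")` would download that checkpoint and run a single tiny epoch; a real run would swap in an actual dataset and a loss that matches its columns.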

The produced model cards are also much more meaningful, see e.g. other recent Sentence Transformer models like cristuf/bge-base-financial-matryoshka.

Also, once your model is ready for people to use, feel free to reach out and I can spread the word on social media.

cc @dangvantuan

Tom Aarsen

dangvantuan

La Javaness org Jun 24, 2024

Hi @tomaarsen
I am using the XLMRoberta architecture but training it only on French and English, so I have customized it into a bilingual model for these two languages. The model is still at the experimental stage. I am currently training it on NLI data and will share it with you soon. I am also experimenting with Sentence Transformers v3.0.
Tuan

tomaarsen

Jun 24, 2024

"so I have customized it into a Bilingual model for these languages."

Out of curiosity, have you trained a custom tokenizer on English/French data? The XLM-R default tokenizer has a lot of tokens that you won't end up using that'll 1) slow down inference and 2) potentially reduce your performance.
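For a rough sense of scale, the arithmetic below counts how many parameters sit in the token embedding matrix alone. It assumes the XLM-R base dimensions (a vocabulary of 250,002 tokens and a hidden size of 768); the 60,000-token bilingual vocabulary is a made-up figure for comparison, not the vocabulary of the model discussed here.

```python
# Published XLM-R base dimensions.
XLMR_VOCAB_SIZE = 250_002
XLMR_BASE_HIDDEN_SIZE = 768

# Assumption: a hypothetical English/French tokenizer with 60k tokens.
HYPOTHETICAL_BILINGUAL_VOCAB_SIZE = 60_000


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Number of weights in a vocab_size x hidden_size token embedding matrix."""
    return vocab_size * hidden_size


full = embedding_params(XLMR_VOCAB_SIZE, XLMR_BASE_HIDDEN_SIZE)
small = embedding_params(HYPOTHETICAL_BILINGUAL_VOCAB_SIZE, XLMR_BASE_HIDDEN_SIZE)
saved = full - small

print(f"Full XLM-R vocabulary:   {full:,} embedding parameters")
print(f"Hypothetical 60k vocab:  {small:,} embedding parameters")
print(f"Difference:              {saved:,} parameters ({saved / full:.0%} of the embedding matrix)")
```

This only covers the size of the embedding table; it says nothing on its own about speed or downstream quality, which would need to be measured.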

I'm glad that you've discovered Sentence Transformers v3.0. I like to think that it can help make your life a bit easier.
I'll happily follow along with your progress.

Tom Aarsen

dangvantuan

La Javaness org Jun 29, 2024

Hi @tomaarsen
I checked the MTEB leaderboard, but for this model I only see Ranking Average (2 datasets) and Summarization Average (1 dataset) displayed.

The metrics for the other tasks are not displayed. Do you know what might be causing this?
Thank you.
Tuan

tomaarsen

Jun 29, 2024

Heya!
I'm OOO now so it's a bit hard to tell, but it might be possible to figure it out by going to the individual tabs and seeing where 1) this model is missing and/or 2) what tasks exist that you don't seem to have scores for. That might be a good start.

Tom Aarsen

dangvantuan

La Javaness org Jul 21, 2024

Hi @tomaarsen
Could you please refresh the MTEB leaderboard (https://huggingface.co/spaces/mteb/leaderboard)?
Thank you so much!
Tuan
