Is ModernBERT already fine-tuned for IR tasks?
Hi there, I was wondering if ModernBERT has already been fine-tuned on IR tasks.
Hello!
Yes, about a dozen times. Here is a filter for Sentence Transformers models built on the ModernBERT architecture: https://huggingface.co/models?library=sentence-transformers&other=modernbert
I'm not sure which model is best (it also depends on your use case); perhaps https://huggingface.co/NohTow/ModernBERT-base-DPR-fullneg-gte-0.0002?
But I suspect that none so far have reached the same performance as some of the models reported on MTEB. We'll have to wait a bit longer for that.
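In case it helps, here is a rough sketch of loading and querying one of these models with the sentence-transformers library. It assumes a recent sentence-transformers release (with the `similarity` helper), and the example texts are made up; any ModernBERT-based model from the filter above should work the same way.

```python
from sentence_transformers import SentenceTransformer

# The checkpoint suggested above; swap in any ModernBERT-based Sentence Transformers model.
model = SentenceTransformer("NohTow/ModernBERT-base-DPR-fullneg-gte-0.0002")

queries = ["what is dense passage retrieval?"]
documents = [
    "Dense Passage Retrieval encodes queries and passages into a shared embedding space.",
    "BM25 is a classic sparse lexical retrieval function.",
]

query_embeddings = model.encode(queries)
document_embeddings = model.encode(documents)

# Higher score = closer match; use this to rank documents per query.
scores = model.similarity(query_embeddings, document_embeddings)
print(scores)
```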
- Tom Aarsen
Thank you for the information and the links!
I was also curious about the base versions, "answerdotai/ModernBERT-base" and "answerdotai/ModernBERT-large": have these models already been fine-tuned for IR tasks, for example with a contrastive loss?
It will certainly be interesting to see ModernBERT's progress as folks get the opportunity to leverage its architectural benefits. Just how much of a difference those benefits will make at scale, and what opportunities they open up for downstream tasks, is yet to be seen.
Hello,
As discussed here, we decided not to chase the MTEB leaderboard (which is a dedicated project in itself) and to let the community apply their recipes to ModernBERT to get competitive models. That is why we did not release the models we trained for the paper experiments.
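To illustrate what such a recipe might look like, here is a minimal, untested sketch of contrastive fine-tuning on top of answerdotai/ModernBERT-base with the Sentence Transformers trainer. The tiny inline dataset, the MultipleNegativesRankingLoss choice, and the hyperparameters are placeholders, not what was used for the paper experiments.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Wrapping the raw checkpoint adds a Transformer + mean-pooling module automatically.
model = SentenceTransformer("answerdotai/ModernBERT-base")

# Placeholder (anchor, positive) pairs; in practice you would load a real IR dataset.
train_dataset = Dataset.from_dict({
    "anchor": [
        "What is the capital of France?",
        "Who wrote Hamlet?",
    ],
    "positive": [
        "Paris is the capital and largest city of France.",
        "Hamlet is a tragedy written by William Shakespeare.",
    ],
})

# In-batch negatives contrastive loss: other positives in the batch act as negatives.
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="modernbert-base-ir",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    bf16=True,  # adjust precision to your hardware
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
model.save_pretrained("modernbert-base-ir/final")
```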
As expected, we are starting to see competitive models being built on top of ModernBERT, such as modernbert-embed-base by @zpn!
It has been added to the MTEB leaderboard and is ranked 56th across all model sizes, and 7th among models with < 250M parameters (it is 149M). This is amazing work done in such a short time window, so we can only assume more is yet to come!