Question about MLDR Evaluation Metrics in ModernBERT Paper

#62
by WoutDeRijck - opened

Hi, I'm working with the MLDR dataset and trying to reproduce the results from the ModernBERT paper. In Table 3, they report an MLDR-EN score of 44.0 for their model, but I'm getting different numbers (for MLDR_OOD):

  • MRR@10: 0.746
  • NDCG@10: 0.781
  • Accuracy@1: 0.670
  • MAP@10: 0.746

This is after training on MS MARCO and evaluating on the MLDR-EN dev set. I'm using the InformationRetrievalEvaluator from sentence-transformers.
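In case it helps frame question 1: my working assumption is that the 44.0 is an nDCG@10 score, since that's what long-document retrieval benchmarks like MLDR typically report. Here's a minimal pure-Python sketch of how I understand nDCG@10 (toy relevance labels for illustration, not my actual evaluation code):

```python
import math

def ndcg_at_k(relevances, k=10):
    """nDCG@k for one query.

    `relevances` are graded relevance labels in the order the
    system ranked the documents (index 0 = top-ranked result).
    """
    def dcg(rels):
        # Discounted cumulative gain: relevance discounted by log2 of rank.
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(rels[:k]))

    ideal = sorted(relevances, reverse=True)  # best possible ordering
    idcg = dcg(ideal)
    return dcg(relevances) / idcg if idcg > 0 else 0.0

# A perfect ranking scores 1.0; pushing the only relevant
# document down to rank 2 drops the score below 1.
print(ndcg_at_k([1, 1, 0]))  # 1.0
print(ndcg_at_k([0, 1]))     # ~0.63
```

If the paper's 44.0 is indeed nDCG@10 (i.e. 0.440), my 0.781 would be far off, which makes me suspect a different evaluation setup rather than a different metric.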

Could someone clarify:

  1. Which metric was used for the 44.0 score in the paper?
  2. Is there a specific evaluation setup I should be using for MLDR?

Thanks in advance!
https://github.com/AnswerDotAI/ModernBERT/issues/193