Question about MLDR Evaluation Metrics in ModernBERT Paper
#62 · opened by WoutDeRijck
Hi, I'm working with the MLDR dataset and trying to reproduce the results from the ModernBERT paper. In Table 3, they report an MLDR-EN score of 44.0 for their model, but I'm getting different numbers (for MLDR_OOD):
- MRR@10: 0.746
- NDCG@10: 0.781
- Accuracy@1: 0.670
- MAP@10: 0.746
This is after training on MS MARCO and evaluating on the MLDR-EN dev set, using the InformationRetrievalEvaluator from sentence-transformers.
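For reference, here is a minimal pure-Python sketch of how I understand MRR@10 and nDCG@10 to be computed under binary relevance (which I'm assuming is what the evaluator uses internally); the function names and the toy ranking are my own, not taken from sentence-transformers or the paper:

```python
import math

def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant document within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """nDCG@k with binary relevance: DCG over the top k, normalized by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank in range(1, min(len(relevant_ids), k) + 1)
    )
    return dcg / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy example: one relevant document ("b") retrieved at rank 2.
ranked = ["a", "b", "c"]
relevant = {"b"}
print(mrr_at_k(ranked, relevant))   # 0.5
print(ndcg_at_k(ranked, relevant))  # ~0.631
```

If the paper's 44.0 were nDCG@10 on this exact split, I'd expect my 0.781 to be in the same ballpark, which is why I suspect a different setup or metric.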
Could someone clarify:
- Which metric was used for the 44.0 score in the paper?
- Is there a specific evaluation setup I should be using for MLDR?
Thanks in advance!
https://github.com/AnswerDotAI/ModernBERT/issues/193