low performance on this checkpoint

#1
by pxyu - opened

Hi,

I am running some experiments with the BGE-M3 family of models to test the impact of unsupervised pre-training. Here are some results (R@100 on MIRACL):

| Model | DE | EN | ES |
|---|---|---|---|
| XLMR + 60M CC News data | 722 | 721 | 763 |
| BGE RETRO + 60M CC News data | 772 | 774 | 789 |
| BGE Unsupervised (this repo) | 727 | 758 | 668 |
| BGE M3 | 908 | 907 | 902 |

The third row, BGE Unsupervised, looks like an outlier here: the unsupervised pre-training done on your side seems to perform worse than training on 60M data points on my side. I wonder whether you uploaded the wrong checkpoint, or whether I am not using or evaluating this checkpoint correctly.
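For reference, here is a minimal sketch of R@100 as it is commonly defined (the fraction of a query's relevant documents that appear in the top 100 retrieved results, averaged over queries). The function names and the dictionary layout are hypothetical and chosen for illustration; this is not necessarily the exact evaluation script behind the numbers above.

```python
from typing import Dict, Iterable, List


def recall_at_k(ranked_doc_ids: List[str], relevant_doc_ids: Iterable[str], k: int = 100) -> float:
    """Fraction of the relevant documents found in the top-k ranked results."""
    relevant = set(relevant_doc_ids)
    if not relevant:
        raise ValueError("query has no relevant documents")
    top_k = set(ranked_doc_ids[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(
    rankings: Dict[str, List[str]],  # hypothetical layout: query id -> ranked doc ids
    qrels: Dict[str, List[str]],     # hypothetical layout: query id -> relevant doc ids
    k: int = 100,
) -> float:
    """Average recall@k over all queries that have at least one relevant document."""
    scores = [
        recall_at_k(rankings.get(qid, []), relevant, k)
        for qid, relevant in qrels.items()
        if relevant
    ]
    if not scores:
        raise ValueError("no queries with relevant documents")
    return sum(scores) / len(scores)
```

If the two sides differ in how this metric is computed (for example in the cutoff, in how queries without judgments are handled, or in pooling and normalization of the embeddings before retrieval), that alone could explain part of the gap.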

Thanks.
