Is there any comparison with Google's MADLAD-400?

#3
by AntoineBlanot - opened

I did not see any comparison with Google's MADLAD-400 10B model. Are you considering an evaluation against it?
It is a Seq2Seq model, so the architecture is a bit different, but the results would be interesting since the parameter counts are relatively similar.
Here is the HuggingFace model: https://huggingface.co/google/madlad400-10b-mt

Thanks for your suggestion! We will include the results of MADLAD-400 10B as soon as possible!

Hi, we have tested MADLAD-400 10B on both WMT'23 and WMT'22. We list the averaged results here in advance and will include detailed results in the next version of our arXiv paper.

WMT'23:

| Model | BLEU | COMET22 | COMETkiwi22 | COMET-kiwi-10B | XCOMET-10B |
|---|---|---|---|---|---|
| ALMA-13B-R | 30.75 | 84.04 | 80.55 | 78.97 | 89.74 |
| MADLAD-10B | 33.33 | 81.48 | 77.87 | 72.02 | 84.84 |

WMT'22:

| Direction | Model | BLEU | COMET22 | COMETkiwi22 | COMET-kiwi-10B | XCOMET-10B |
|---|---|---|---|---|---|---|
| xx-en | ALMA-R | 35.45 | 85.21 | 81.33 | 82.43 | 89.11 |
| xx-en | MADLAD-10B | 37.45 | 84.50 | 80.48 | 80.51 | 87.18 |
| en-xx | ALMA-R | 27.03 | 87.74 | 83.34 | 85.74 | 94.05 |
| en-xx | MADLAD-10B | 33.85 | 85.42 | 80.89 | 79.46 | 89.10 |

The figure gives a more direct comparison:
[Figure: almar.png]
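For readers who want the gaps as numbers rather than a plot, here is a minimal sketch that subtracts the MADLAD-10B averages from the ALMA averages for each metric. The scores are copied from the tables in this thread; the names `METRICS`, `RESULTS`, `score_gaps`, and `GAPS` are made up for illustration and are not part of any ALMA or MADLAD tooling.

```python
# Averaged scores copied from the tables in this thread.
# All names here (METRICS, RESULTS, score_gaps, GAPS) are illustrative only.
METRICS = ["BLEU", "COMET22", "COMETkiwi22", "COMET-kiwi-10B", "XCOMET-10B"]

# Each setting maps to (ALMA scores, MADLAD-10B scores), in METRICS order.
RESULTS = {
    "WMT'23": (
        [30.75, 84.04, 80.55, 78.97, 89.74],
        [33.33, 81.48, 77.87, 72.02, 84.84],
    ),
    "WMT'22 xx-en": (
        [35.45, 85.21, 81.33, 82.43, 89.11],
        [37.45, 84.50, 80.48, 80.51, 87.18],
    ),
    "WMT'22 en-xx": (
        [27.03, 87.74, 83.34, 85.74, 94.05],
        [33.85, 85.42, 80.89, 79.46, 89.10],
    ),
}


def score_gaps(alma, madlad):
    """Return ALMA minus MADLAD for each metric, rounded to two decimals."""
    return {m: round(a - b, 2) for m, a, b in zip(METRICS, alma, madlad)}


# Positive gap: ALMA scores higher. Negative gap: MADLAD-10B scores higher.
GAPS = {setting: score_gaps(alma, madlad)
        for setting, (alma, madlad) in RESULTS.items()}

for setting, gaps in GAPS.items():
    print(setting)
    for metric, gap in gaps.items():
        print(f"  {metric:<15} {gap:+.2f}")
```

With the averages reported above, the BLEU gap is negative in all three settings (MADLAD-10B is higher), while the gaps on the four COMET-family metrics are all positive (ALMA is higher).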

Thank you very much!
