Is there any comparison with Google's MADLAD-400?

#3
by AntoineBlanot - opened

I did not see any comparison with Google's MADLAD-400 10B model. Would you consider evaluating against it?
It is a Seq2Seq model, so the architecture is a bit different, but the results would be interesting since the parameter counts are relatively similar.
Here is the HuggingFace model: https://huggingface.co/google/madlad400-10b-mt

Owner
β€’
edited Jan 22

Thanks for your suggestion! We will include the results of MADLAD-400 10B as soon as possible!

Hi, we have tested MADLAD-400 10B on both WMT'23 and WMT'22. We list the averaged results here in advance; detailed results will appear in the next version of our arXiv paper.

WMT'23:

| Model | BLEU | COMET22 | COMETkiwi22 | COMET-kiwi-10B | XCOMET-10B |
|---|---|---|---|---|---|
| ALMA-13B-R | 30.75 | 84.04 | 80.55 | 78.97 | 89.74 |
| MADLAD-10B | 33.33 | 81.48 | 77.87 | 72.02 | 84.84 |

WMT'22 (xx-en):

| Model | BLEU | COMET22 | COMETkiwi22 | COMET-kiwi-10B | XCOMET-10B |
|---|---|---|---|---|---|
| ALMA-R | 35.45 | 85.21 | 81.33 | 82.43 | 89.11 |
| MADLAD-10B | 37.45 | 84.50 | 80.48 | 80.51 | 87.18 |

WMT'22 (en-xx):

| Model | BLEU | COMET22 | COMETkiwi22 | COMET-kiwi-10B | XCOMET-10B |
|---|---|---|---|---|---|
| ALMA-R | 27.03 | 87.74 | 83.34 | 85.74 | 94.05 |
| MADLAD-10B | 33.85 | 85.42 | 80.89 | 79.46 | 89.10 |
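To make the pattern in the tables explicit, here is a minimal Python sketch (not from the thread; the numbers are copied from the WMT'23 averages above, higher is better for every metric) that checks which model leads on each metric:

```python
# Reported WMT'23 averaged scores from the thread (higher is better).
wmt23 = {
    "ALMA-13B-R": {"BLEU": 30.75, "COMET22": 84.04, "COMETkiwi22": 80.55,
                   "COMET-kiwi-10B": 78.97, "XCOMET-10B": 89.74},
    "MADLAD-10B": {"BLEU": 33.33, "COMET22": 81.48, "COMETkiwi22": 77.87,
                   "COMET-kiwi-10B": 72.02, "XCOMET-10B": 84.84},
}

# For each metric, record which model scores higher.
winners = {metric: max(wmt23, key=lambda model: wmt23[model][metric])
           for metric in wmt23["ALMA-13B-R"]}
print(winners)
```

Running it shows MADLAD-10B ahead only on BLEU, with ALMA-13B-R ahead on all four neural metrics; the WMT'22 rows above follow the same pattern.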

A more direct comparison via the figure:

[Figure: almar.png]

Thank you very much!
