openbmb/UltraLM-65b's score is worse than "meta-llama/Llama-2-70b-hf" but the leaderboard says it is better

#171
by mhemetfaik - opened

openbmb/UltraLM-65b's scores:

67.1 + 85 + 63.5 + 53.5 = 269.1

meta-llama/Llama-2-70b-hf's scores:

67.3 + 87.3 + 69.8 + 44.9 = 269.3
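The sums above translate into averages of about 67.275 vs 67.325, which both display as 67.3 at one decimal place. A quick sketch (assuming the displayed score is the plain mean of the four benchmark numbers quoted above) shows why a sort on the rounded value cannot tell the two models apart:

```python
# Per-benchmark scores quoted in the post.
ultralm_65b = [67.1, 85.0, 63.5, 53.5]   # openbmb/UltraLM-65b
llama2_70b = [67.3, 87.3, 69.8, 44.9]    # meta-llama/Llama-2-70b-hf

ultralm_avg = sum(ultralm_65b) / len(ultralm_65b)  # ~67.275
llama2_avg = sum(llama2_70b) / len(llama2_70b)     # ~67.325

# Llama-2-70b-hf's true average is higher...
print(llama2_avg > ultralm_avg)  # True

# ...but both averages round to the same value at one decimal place,
# so sorting on the rounded value can put them in either order.
print(round(ultralm_avg, 1), round(llama2_avg, 1))  # 67.3 67.3
```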

Leaderboard screenshot: [image attached]

Open LLM Leaderboard org

Hi @mhemetfaik !
That's a very good point! We are sorting after rounding; I'll fix it to prevent these edge cases.

Open LLM Leaderboard org
edited Aug 7, 2023

Hi!
I changed the rounding to 2 decimal points, which should fix most cases.
We'll still get edge cases (hopefully very rarely), as separating the displayed rounding from the underlying numbers is not yet possible in Gradio. I also opened an issue there, and I'll port their fix once it's done.
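The fix can be sketched as: sort on the full-precision average and round only for display. (The model names and scores below are just the two from this thread; the actual leaderboard code is Gradio-based and differs.)

```python
models = {
    "openbmb/UltraLM-65b": (67.1 + 85.0 + 63.5 + 53.5) / 4,
    "meta-llama/Llama-2-70b-hf": (67.3 + 87.3 + 69.8 + 44.9) / 4,
}

# Buggy: sort on the value that is displayed (rounded to 1 decimal).
# Both models round to 67.3, so their relative order is arbitrary.
buggy = sorted(models, key=lambda m: round(models[m], 1), reverse=True)

# Fixed: sort on the full-precision average, round only when printing.
fixed = sorted(models, key=models.get, reverse=True)
for name in fixed:
    print(f"{name}: {models[name]:.2f}")
```

With the fixed ordering, meta-llama/Llama-2-70b-hf comes first, matching the raw sums in the original post. Rounding to two decimals also separates the displayed values here, which is why it fixes most (but not all) such ties.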

Thank you very much for raising!

clefourrier changed discussion status to closed

Thanks for your interest. As you said, this will fix most cases; thank you for fixing it so quickly.
