Code for evaluating new models?

#620
by YannDubs - opened

Hi, is the exact script used to run the open_lm_leaderboard open-sourced? I only found vague commands that suggest using lm_eval.

Thanks!

Open LLM Leaderboard org

Hi!
You can find the precise steps to reproduce our evaluations in the About tab of the leaderboard. The Open LLM Leaderboard runs its evaluations with EleutherAI's lm-evaluation-harness.
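For readers who want a rough idea of what such a run looks like, here is a minimal sketch that only assembles a harness command line. It is not the leaderboard's own script. The helper name `build_harness_command`, the model id, the task, the few-shot count, and the output path are all placeholders, and the `lm_eval` entry point with the `hf` model type is assumed from recent harness releases. The About tab remains the reference for the exact harness version, task list, and settings.

```python
import shlex


def build_harness_command(model_id, tasks, num_fewshot, output_path, batch_size=1):
    """Assemble an lm_eval command line as a list of arguments.

    Hypothetical helper for illustration only; every value passed in
    should be taken from the leaderboard's About tab.
    """
    return [
        "lm_eval",                                 # assumed CLI entry point of the harness
        "--model", "hf",                           # assumed Hugging Face model backend
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),
        "--output_path", output_path,
    ]


# Placeholder values, not the leaderboard's actual configuration.
cmd = build_harness_command(
    model_id="my-org/my-model",
    tasks=["arc_challenge"],
    num_fewshot=25,
    output_path="results/arc_challenge",
)

# Print the command so it can be reviewed before running it by hand.
print(shlex.join(cmd))
```

The sketch prints the command instead of executing it, so the flags can be checked against the installed harness version first.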

clefourrier changed discussion status to closed
