Problem evaluating 72B, please help

#1117
by Marsouuu - opened

I tried several times to submit a merged 72B model to the leaderboard, but the evaluation doesn't go through, even though inference works fine on the resulting model.

Sorry again for bothering you, @clefourrier. Every time we try to submit this merged model:

https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/Baptiste-HUVELLE-10/LeTriomphant2.2_ECE_iLAB_eval_request_False_bfloat16_Original.json

the leaderboard fails the evaluation. However, when we manually test the inference, it works fine without hallucinations.

Did we miss something?
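One thing worth double-checking before resubmitting is that the precision declared in the request file matches the dtype in the merged checkpoint's config, since a mismatch can surface as a runtime error on the harness side even when local inference works. A minimal sketch of that sanity check (the field names `precision` and `torch_dtype` are assumptions inferred from the `bfloat16` / `Original` tags in the request filename, not a confirmed leaderboard schema):

```python
# Hypothetical request payload; field names are assumptions based on
# the request filename tags (eval_request_False_bfloat16_Original).
request = {
    "model": "Baptiste-HUVELLE-10/LeTriomphant2.2_ECE_iLAB",
    "precision": "bfloat16",
    "weight_type": "Original",
}

# dtype as it would appear in the merged model's config.json
config = {"torch_dtype": "bfloat16"}

def precision_matches(request: dict, config: dict) -> bool:
    """Return True if the submitted precision matches the checkpoint dtype."""
    return request["precision"] == config.get("torch_dtype")

print(precision_matches(request, config))  # True for a consistent submission
```

If this check fails, resubmitting with the precision that matches the checkpoint is a cheap first thing to try before debugging the CUDA error itself.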

Thank you very much in advance,

Open LLM Leaderboard org

Hi @Marsouuu ,

Thank you for providing the link to the request file!

According to the log, it was a CUDA error. I'll look at the model evaluation manually and get back to you when I have the results.

Hello and thank you very much for your time, @alozowski ,

I’m looking forward to hearing back from you soon so we can fix it if the issue is on our end. 😁

Thanks again!

Hello @alozowski ,
I am part of the study team working with @Marsouuu, so I'm reaching out to ask whether you were able to evaluate our model, or whether you have any first results to share with us.
Thank you in advance
Have a good day

Open LLM Leaderboard org

Hi to you two!
We'll keep you posted once it's done; please be patient.
