Riiid/sheep-duck-llama-2-70b-v1.1 disappeared from the leaderboard

#366
by l-yohai - opened

I saw that Riiid/sheep-duck-llama-2-70b-v1.1 and Riiid/sheep-duck-llama-2 were submitted two days ago, and that their status in the requests changed to FINISHED.
However, I cannot find the result file, and the model has disappeared from the leaderboard.

Could you please check on it? @SaylorTwift

Open LLM Leaderboard org

Hi!
We had a small results upload problem on some models a couple of days ago; we must have missed this one.
Models are not displayed on the leaderboard if not all results are available, and the new evals for this one have not yet been pushed to the details files (which you'll find here and here).
We'll fix it asap!
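
For illustration, the display rule described above (a model stays hidden until all of its results are available) could look roughly like the sketch below. The benchmark names and the dictionary layout are placeholders assumed for the example, not the leaderboard's actual code or task list.

```python
# Hypothetical benchmark names; the real leaderboard's list may differ.
REQUIRED_BENCHMARKS = {"bench_a", "bench_b", "bench_c"}


def is_displayable(results: dict) -> bool:
    """Return True only if every required benchmark has a score."""
    return all(results.get(name) is not None for name in REQUIRED_BENCHMARKS)


def visible_models(all_results: dict) -> list:
    """Return the sorted names of models whose results are complete."""
    return sorted(
        model for model, results in all_results.items() if is_displayable(results)
    )
```

Under this assumed rule, a model with even one missing score is filtered out, which would match a model vanishing while new evals are still pending.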

Open LLM Leaderboard org

Hi @l-yohai, your model is being run at the moment; you will see it appear soon!
Thank you for your patience :)

SaylorTwift changed discussion status to closed

Hi!
After re-submitting the models, the model still does not appear to have been re-evaluated properly.
I can only find the 13b model on the leaderboard (though I found no differences between the requests files).

l-yohai changed discussion status to open
Open LLM Leaderboard org

Hi! Your model is still running, please be patient - it's a big model, it's taking several days to evaluate.

Open LLM Leaderboard org

I'm going to close this issue, feel free to reopen it if the model does not appear

clefourrier changed discussion status to closed

After raising the issue for the first time 12 days ago, I waited and re-submitted, and the status was changed to FINISHED 4 days ago.
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/Riiid/sheep-duck-llama-2-70b-v1.1_eval_request_False_float16_Original.json

"status": "FINISHED" (4 days ago)

I waited a long time, but I couldn't find the result itself, and the last evaluation is from 2 months ago. (https://huggingface.co/datasets/open-llm-leaderboard/details_Riiid__sheep-duck-llama-2-70b-v1.1/tree/main, https://huggingface.co/datasets/open-llm-leaderboard/results/tree/main/Riiid/sheep-duck-llama-2-70b-v1.1)

I want you to tell me what's going on and how much longer I have to wait. @clefourrier @SaylorTwift

Open LLM Leaderboard org

Hi!

"I want you to tell me what's going on and how much longer I have to wait."

You know we are actual human beings on the other side of the screen, right? I'm doing my best to be courteous and kind, it would be nice if you could do the same.

To answer your actual question (thank you very much for pointing out the request file), I checked the log: your job was cancelled and has not been relaunched yet (for priority reasons). We've got around 30 jobs (including yours) that are waiting for space on the cluster. I have no idea when there will be space, as the leaderboard does not have priority on the research cluster.

I think the status was changed to FINISHED because you already have a result file from a prior run, and our backend checks if the files exist to change the status. This is a bug, we'll fix it.
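
As a rough picture of the bug described above, the sketch below contrasts a status check that only looks at whether a result file exists with one that also compares timestamps. Everything except the "FINISHED" status value is an assumption for illustration: the function names, the "PENDING" label, and the timestamp comparison are hypothetical and do not come from the leaderboard backend.

```python
from datetime import datetime


def status_buggy(result_times: list, submitted_at: datetime) -> str:
    """Mark FINISHED as soon as any result file exists, even a stale one."""
    return "FINISHED" if result_times else "PENDING"


def status_checked(result_times: list, submitted_at: datetime) -> str:
    """Mark FINISHED only if a result file is newer than the current submission."""
    has_fresh_result = any(t > submitted_at for t in result_times)
    return "FINISHED" if has_fresh_result else "PENDING"
```

With a result file left over from a run two months earlier, the first check reports FINISHED for a job that never ran, while the second keeps it pending until a newer result is written.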

I didn't mean any harm, and I apologize if I came across as a bit sharp. It was just out of frustration from the wait.
I appreciate your response and am thankful for the kind way you've communicated. I'll look forward to the update.
Thank you for your hard work. @clefourrier

Open LLM Leaderboard org

No hard feelings :)

I also 100% understand that it's very frustrating to wait - tbh, the new priority system on the cluster has caused quite a lot of waiting for us too, and extra work to adapt our logging/scheduling system (still needs to be updated for some edge cases apparently 😅 ), and it's also a bit annoying to have these delays for the users.
At the same time, it's way more fair for all the researchers doing good work on their model training projects to not have to wait for leaderboard jobs to finish before launching big model training runs.

Thank you, @clefourrier , for the detailed explanation.

I understand now that there are constraints and prioritization challenges on the research cluster, and I appreciate the efforts your team is making to manage these tasks fairly.

I'll continue to wait for the update and am grateful for your team's hard work in handling these complex processes.
If there's anything I can contribute, please let me know. I'd like to always be of help to this tremendous project.

Thank you again for your assistance and patience.
