[Help needed] Re-labelling models to separate different kinds of fine-tuning

#160
by clefourrier (HF staff) - opened
Open LLM Leaderboard org

@jaspercatapang suggested we should separate instruction-tuned from (vanilla) fine-tuned, and I agree!

If you want to lend a hand, please open a PR that changes the information in the TYPE_METADATA dict in this file, and I'll merge it ASAP!
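
For anyone unfamiliar with the file: each entry in that dict maps a model id to its type label. Here is a minimal sketch of the idea, assuming a simple enum; the actual names, values, and structure in the repo may differ (the example entries are mine, not taken from the real file):

```python
from enum import Enum

# Assumed enum of model types; the repo's real definition may differ.
class ModelType(Enum):
    PT = "pretrained"          # base model, no further tuning
    FT = "fine-tuned"          # further trained on data, not instructions
    IFT = "instruction-tuned"  # tuned to follow instructions
    RL = "RL-tuned"            # tuned with reinforcement learning (e.g. RLHF)

# Maps "org/model" ids on the leaderboard to a type label.
TYPE_METADATA = {
    "EleutherAI/gpt-neox-20b": ModelType.PT,
    "databricks/dolly-v2-12b": ModelType.IFT,
}
```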

Re-labelled the models in this PR. It might need some reformatting.

For consistency, I followed a simple guide (sketched as code after the list):

  1. If the model type is pre-trained or RL-tuned, retain it.
  2. If the model card mentions that the model follows instructions, the new model type is instruction-tuned.
  3. If the model card makes no reference to instruction-following, the new model type is fine-tuned.
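
As an illustration only, the three rules could be expressed roughly like this (`relabel` is a hypothetical helper, not code from the leaderboard repo, and the label strings are assumptions; the actual re-labelling was done by hand):

```python
def relabel(current_type: str, model_card: str) -> str:
    """Hypothetical sketch of the re-labelling rules above."""
    # Rule 1: keep pre-trained and RL-tuned labels unchanged.
    if current_type in ("pretrained", "RL-tuned"):
        return current_type
    # Rule 2: a card that mentions instruction following becomes
    # instruction-tuned ("instruct" also matches "instruction(s)").
    if "instruct" in model_card.lower():
        return "instruction-tuned"
    # Rule 3: everything else is a plain fine-tune.
    return "fine-tuned"

print(relabel("fine-tuned", "Trained to follow natural-language instructions."))
# -> "instruction-tuned"
```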

If there are errors in my re-labelling, please open a PR to modify it. Thank you.

Open LLM Leaderboard org

That's amazing, thank you!
I'll leave your PR open for the week in case the community wants to comment on or adjust it, and merge it on Friday!

clefourrier pinned discussion

My only suggestion is that maybe there should be a "dialog-tuned" category. Instruction tuning does not imply tuning for multi-turn dialog, aka "chat". RLHF almost always means dialog-tuned; I am not aware of anyone doing RLHF for something that is not a chat model. Essentially, instruction tuning alone implies a single-turn dialog: one instruction, one response. If a model card says "we made a chat model" or "we tuned for dialog", that implies more than just instruction tuning.
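
If such a category were adopted, the taxonomy sketched earlier could grow one member, e.g. (again an assumption-laden sketch, not the repo's actual code):

```python
from enum import Enum

# Extends the earlier hypothetical enum; names are assumptions.
class ModelType(Enum):
    PT = "pretrained"
    FT = "fine-tuned"
    IFT = "instruction-tuned"  # single-turn: one instruction, one response
    DT = "dialog-tuned"        # multi-turn chat; RLHF models typically fit here
    RL = "RL-tuned"
```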

clefourrier changed discussion status to closed
clefourrier unpinned discussion
