metadata

library_name: peft
license: apache-2.0

Framework versions

PEFT 0.5.0

Model Card for MCQ-Classifier-MMLU-EFG

MCQ-Classifier is a parameter-efficient fine-tune of the 7B Mistral-7B-v0.1 base model that automatically detects which answer option a model selected in its response to a multiple-choice question.

This model is trained on annotated model outputs for the MMLU dataset. We collected responses from Llama2-7b-chat, Llama2-13b-chat, and Mistral-7b-Inst-v0.2.

For full details of this model please read our paper.

"EFG"

During our annotation phase, we noticed that models do not always choose one of the available answer candidates; they may instead refuse to answer or claim "No correct answer available." We therefore consider three additional cases, "Refusal", "No correct answer", and "I don't know", and add them to the answer candidates, extending the option range from "A-D" to "A-G". Note that we shuffle the order of the options in our dataset, so "E", "F", and "G" do not necessarily correspond to "Refusal", "No correct answer", and "I don't know".

Also note that if the model refuses to answer for safety reasons, the answer is mapped to the refusal option, such as "D. Refused".
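The option extension and shuffling described in this section can be sketched as follows. This is a minimal illustration, not the dataset construction code: the function name `extend_and_shuffle`, the `seed` parameter, and the exact wording of the three extra options (copied from the example prompt in this card) are assumptions.

```python
import random
import string

# Wording copied from the example prompt in this card; whether the dataset
# uses exactly these strings is an assumption.
EXTRA_OPTIONS = ["No correct answer is given", "I do not know", "Refused"]


def extend_and_shuffle(options, seed=None):
    """Append the three extra options, shuffle, and label the result A-G."""
    rng = random.Random(seed)
    candidates = list(options) + EXTRA_OPTIONS
    rng.shuffle(candidates)
    # zip stops after the seventh candidate, so the labels are A through G.
    return dict(zip(string.ascii_uppercase, candidates))


labeled = extend_and_shuffle(
    ["Lillian Gilbreth", "Frederick Taylor", "Mary Parker Follett", "Elton Mayo"],
    seed=0,
)
for letter, text in labeled.items():
    print(f"{letter}. {text}")
```

Because the candidates are shuffled, the refusal option can land on any letter, which is why the letter alone does not identify the case.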

Run the model

You should construct your input in the following format: model_response + "\nReferences:" + references + "\nAnswer:"

For example:

inputs = ' Sure! I can help you with that. The answer to the question is:\n\nB. Frederick Taylor \nReferences: \nA. Lillian Gilbreth \nB. Frederick Taylor \nC. No correct answer is given \nD. I do not know \nE. Refused \nF. Mary Parker Follett \nG. Elton Mayo \nAnswer:'
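An input string like the one above can be assembled with a small helper. This is a sketch: `build_classifier_input` is a hypothetical name, and the spacing around each option line (a space before and after each newline) is inferred from the example rather than documented, so treat it as an assumption.

```python
def build_classifier_input(model_response, options):
    """Concatenate a response and lettered options into the classifier input."""
    # Each option is written as " \nX. text", mirroring the spacing of the
    # example above (inferred from that example, not a documented rule).
    references = "".join(f" \n{letter}. {text}" for letter, text in options) + " "
    return model_response + "\nReferences:" + references + "\nAnswer:"


response = (
    " Sure! I can help you with that. The answer to the question is:"
    "\n\nB. Frederick Taylor "
)
options = [
    ("A", "Lillian Gilbreth"),
    ("B", "Frederick Taylor"),
    ("C", "No correct answer is given"),
    ("D", "I do not know"),
    ("E", "Refused"),
    ("F", "Mary Parker Follett"),
    ("G", "Elton Mayo"),
]
inputs = build_classifier_input(response, options)
print(inputs)
```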

Then feed it to the classifier:

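from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig

# Adapter configuration (not used below; handy for inspecting the adapter settings).
config = PeftConfig.from_pretrained("mainlp/MCQ-Classifier-MMLU-EFG")

# Load the Mistral-7B base model and attach the classifier adapter to it.
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base_model, "mainlp/MCQ-Classifier-MMLU-EFG")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Wrap the formatted input in the instruction template.
to_classify = f"""<s>[INST] Classify the response.{inputs} [/INST]"""
model_input = tokenizer(to_classify, return_tensors="pt")

# Greedily generate a single new token: the predicted option letter.
output = model.generate(**model_input, max_new_tokens=1, do_sample=False)
# generate() returns a tensor of token IDs; row 0 holds the prompt followed by the predicted letter.
print(tokenizer.decode(output[0], skip_special_tokens=True))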
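Because the options are shuffled, the predicted letter only becomes meaningful once it is mapped back to the option text in the References block. A minimal sketch of that lookup follows; `parse_references` and `resolve_prediction` are hypothetical helper names, not part of this repository, and the parsing assumes the input follows the format shown earlier.

```python
import re


def parse_references(classifier_input):
    """Return {letter: option text} parsed from the References block."""
    # Take the text between the last "\nReferences:" and the last "\nAnswer:".
    _, _, tail = classifier_input.rpartition("\nReferences:")
    block = tail.rpartition("\nAnswer:")[0]
    return {
        m.group(1): m.group(2).strip()
        for m in re.finditer(r"^([A-G])\.\s*(.+)$", block, re.MULTILINE)
    }


def resolve_prediction(classifier_input, predicted_letter):
    """Map a predicted letter to its option text, or None if it is not listed."""
    return parse_references(classifier_input).get(predicted_letter.strip())


example_input = (
    " Sure! I can help you with that. The answer to the question is:"
    "\n\nB. Frederick Taylor \nReferences: \nA. Lillian Gilbreth "
    "\nB. Frederick Taylor \nC. No correct answer is given \nD. I do not know "
    "\nE. Refused \nF. Mary Parker Follett \nG. Elton Mayo \nAnswer:"
)
print(resolve_prediction(example_input, "B"))
```

With this lookup, a prediction of "E" on the example input resolves to the refusal option rather than to a content answer, which matters when aggregating accuracy.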

Cite

@article{wang2024my,
  title={"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models},
  author={Wang, Xinpeng and Ma, Bolei and Hu, Chengzhi and Weber-Genzel, Leon and R{\"o}ttger, Paul and Kreuter, Frauke and Hovy, Dirk and Plank, Barbara},
  journal={arXiv preprint arXiv:2402.14499},
  year={2024}
}

@article{wang2024look,
  title={Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think},
  author={Wang, Xinpeng and Hu, Chengzhi and Ma, Bolei and R{\"o}ttger, Paul and Plank, Barbara},
  journal={arXiv preprint arXiv:2404.08382},
  year={2024}
}