The model keeps outputting "pass" for questions in HumanEval

#23
by mz227 - opened

I am trying to use the Hugging Face `transformers` library with `model.generate` to replicate the reported results on the HumanEval dataset.
However, the model keeps outputting `pass` for nearly all of the problems.

I am not sure whether the tokenizer and generation settings of the HF version are identical to those of the mistral-inference version.
Has anyone had the same experience? I would like to know how to fix it.

I understand that the mistral-inference library generates the expected results; however, I want to make some modifications to the Mistral models, which requires me to reproduce the results in the default Hugging Face training/inference pipeline first.
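In case it helps narrow things down: a common cause of degenerate completions like `pass` is the evaluation setup rather than the weights, i.e. how the completion is decoded and truncated. HumanEval-style evaluation (following the Codex paper) cuts the raw generation at a fixed set of stop sequences before executing it. A minimal sketch of that truncation step, assuming the standard Codex-paper stop markers (the function name and sample string here are illustrative):

```python
# Stop sequences from the Codex paper's HumanEval evaluation:
# generation is truncated at the first occurrence of any of them.
STOP_SEQUENCES = ["\nclass", "\ndef", "\n#", "\nif", "\nprint"]

def truncate_completion(completion: str, stops=STOP_SEQUENCES) -> str:
    """Cut the raw model output at the earliest stop sequence, if any."""
    cut = len(completion)
    for stop in stops:
        idx = completion.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return completion[:cut]

# Without truncation, text after the target function body (e.g. a stray
# follow-up definition containing `pass`) would be executed as part of
# the candidate solution.
raw = "    return x + 1\n\ndef unrelated():\n    pass\n"
print(repr(truncate_completion(raw)))
```

If the harness skips this step, or if extra special tokens from the HF tokenizer (e.g. a duplicated BOS) shift the prompt, the scored completion can easily collapse to a trivial `pass` body even when the model itself is fine.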
