Instruction following question
I have a question concerning the benchmark, because I'm not sure I understand it correctly. Abliterated models like QuantFactory/NeuralDaredevil-8B-abliterated-GGUF are scoring very high on it. They are abliterated in order to be uncensored, and this seems to raise the IFEval score so much that the model scores above Command R+ while having less than 10% of its parameters. Does the IFEval dataset contain ethically problematic questions that the base instruct model would have refused? That would explain NeuralDaredevil's very high score. If instead the dataset contains no ethically or morally problematic prompts, only instructions of the kind "repeat Apple three times and then answer", then the result would mean that ethical and moral fine-tuning also degrades the model's instruction-following ability.
So my questions are: is there morally or ethically problematic data in the dataset? And does anybody know whether ethical and moral alignment hurts a model's instruction-following ability?
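
To make concrete what I mean, here is a minimal sketch. It assumes an instruction of the "repeat Apple three times" kind is scored by a simple programmatic check; the function name and the example responses are my own hypothetical illustration, not the actual IFEval code. Under that assumption, any refusal counts as a failed instruction, whether or not the prompt was actually problematic.

```python
import re


def follows_repeat_instruction(response: str, word: str, times: int) -> bool:
    """Hypothetical check: does `word` occur exactly `times` times in the response?

    Matching is case-insensitive and on whole words only.
    This is an illustration, not the benchmark's real scoring code.
    """
    matches = re.findall(rf"\b{re.escape(word)}\b", response, flags=re.IGNORECASE)
    return len(matches) == times


# Made-up responses for illustration only.
compliant = "Apple Apple Apple. Here is my answer to the question."
refusal = "I'm sorry, but I can't help with that request."

print(follows_repeat_instruction(compliant, "Apple", 3))
print(follows_repeat_instruction(refusal, "Apple", 3))
```

If the real scoring works roughly like this, an over-cautious model would lose points on harmless prompts as well, which is the effect I'm asking about.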