Bias/Fairness evaluation unclear
#3 by kmargatina - opened
How can I reproduce the results for the bias/fairness evaluation? It is not clear from the paper how you cast CrowS-Pairs, WinoGender, and WinoBias as classification tasks. Did you use specific templates for these tasks?
"For each dataset, we evaluate between 5 and 10 prompts.": What does this mean?
Thank you in advance!
Hi @kmargatina ,
You will find the prompts used for the bias & fairness evaluations directly in promptsource (https://github.com/bigscience-workshop/promptsource).
To limit variance and the risk of a version mismatch with the numbers reported in the card, I would recommend using the v0.1 or v0.2 version of the repo.
Victor
Thank you Victor!
VictorSanh changed discussion status to closed