Bias/Fairness evaluation unclear
#3 by kmargatina - opened
How can I reproduce the results for the bias/fairness evaluation? It is not clear from the paper how you cast CrowS-Pairs, WinoGender, and WinoBias as classification tasks. Did you use specific templates for these tasks?
"For each dataset, we evaluate between 5 and 10 prompts.": What does this mean?
Thank you in advance!
Hi @kmargatina ,
You will find the prompts used for the bias & fairness evaluations directly in promptsource (https://github.com/bigscience-workshop/promptsource).
To limit variance and the risk of a version mismatch with the numbers reported in the card, I would recommend using the v0.1 or v0.2 version of the repo.
Victor
Thank you Victor!
VictorSanh changed discussion status to closed