# simp_demo/configs/weat.yaml (commit 0c7d699, Avijit Ghosh: removed csv, added support for datasets)
Abstract: "Artificial intelligence and machine learning are in a period of astounding\
\ growth. However, there are concerns that these\ntechnologies may be used, either\
\ with or without intention, to perpetuate the prejudice and unfairness that unfortunately\n\
characterizes many human institutions. Here we show for the first time that human-like\
\ semantic biases result from the\napplication of standard machine learning to ordinary\
\ language\u2014the same sort of language humans are exposed to every\nday. We replicate\
\ a spectrum of standard human biases as exposed by the Implicit Association Test\
\ and other well-known\npsychological studies. We replicate these using a widely\
\ used, purely statistical machine-learning model\u2014namely, the GloVe\nword embedding\u2014\
trained on a corpus of text from the Web. Our results indicate that language itself\
\ contains recoverable and\naccurate imprints of our historic biases, whether these\
\ are morally neutral as towards insects or flowers, problematic as towards\nrace\
\ or gender, or even simply veridical, reflecting the status quo for the distribution\
\ of gender with respect to careers or first\nnames. These regularities are captured\
\ by machine learning along with the rest of semantics. In addition to our empirical\n\
findings concerning language, we also contribute new methods for evaluating bias\
\ in text, the Word Embedding Association\nTest (WEAT) and the Word Embedding Factual\
\ Association Test (WEFAT). Our results have implications not only for AI and\n\
machine learning, but also for the fields of psychology, sociology, and human ethics,\
\ since they raise the possibility that mere\nexposure to everyday language can\
\ account for the biases we replicate here."
Applicable Models: .nan
Authors: Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan
Considerations: Although the test is grounded in human associations, general societal
  attitudes do not always represent those of specific subgroups or cultures.
Datasets: .nan
Group: BiasEvals
Hashtags:
- Bias
- Word Association
- Embeddings
- NLP
Link: Semantics derived automatically from language corpora contain human-like biases
Modality: Text
Screenshots:
- Images/WEAT1.png
- Images/WEAT2.png
Suggested Evaluation: Word Embedding Association Test (WEAT)
Type: Model
URL: https://researchportal.bath.ac.uk/en/publications/semantics-derived-automatically-from-language-corpora-necessarily
What it is evaluating: Associations in word embeddings, based on the Implicit Association
  Test (IAT)