Avijit Ghosh committed
Commit 61b63e9 · 2 parents: 6c70e1e 2ec74bb
Images/Forgetting1.png ADDED
Images/Forgetting2.png ADDED
Images/SLD1.png ADDED
Images/SLD2.png ADDED
Images/TANGO1.png ADDED
Images/TANGO2.png ADDED
Images/WEAT1.png CHANGED
Images/WEAT2.png CHANGED
__pycache__/css.cpython-312.pyc CHANGED
Binary files a/__pycache__/css.cpython-312.pyc and b/__pycache__/css.cpython-312.pyc differ
 
app.py CHANGED
@@ -4,13 +4,31 @@ import pandas as pd
4
  from gradio_modal import Modal
5
  import os
6
  import yaml
7
-
8
 
9
  folder_path = 'configs'
10
  # List to store data from YAML files
11
  data_list = []
12
  metadata_dict = {}
13
 
14
  # Iterate over each file in the folder
15
  for filename in os.listdir(folder_path):
16
  if filename.endswith('.yaml'):
@@ -27,25 +45,24 @@ globaldf['Link'] = '<u>'+globaldf['Link']+'</u>'
27
 
28
  # Define the desired order of categories
29
  modality_order = ["Text", "Image", "Audio", "Video"]
30
- type_order = ["Model", "Dataset", "Output", "Taxonomy"]
31
 
32
- # Convert Modality and Type columns to categorical with specified order
33
  globaldf['Modality'] = pd.Categorical(globaldf['Modality'], categories=modality_order, ordered=True)
34
- globaldf['Type'] = pd.Categorical(globaldf['Type'], categories=type_order, ordered=True)
35
 
36
- # Sort DataFrame by Modality and Type
37
- globaldf.sort_values(by=['Modality', 'Type'], inplace=True)
38
 
39
  # create a gradio page with tabs and accordions
40
 
41
  # Path: taxonomy.py
42
 
43
- def filter_modality(filteredtable, modality_filter):
44
- filteredtable = filteredtable[filteredtable['Modality'].isin(modality_filter)]
45
- return filteredtable
46
-
47
- def filter_type(filteredtable, modality_filter):
48
- filteredtable = filteredtable[filteredtable['Type'].isin(modality_filter)]
49
  return filteredtable
50
 
51
  def showmodal(evt: gr.SelectData):
@@ -55,6 +72,7 @@ def showmodal(evt: gr.SelectData):
55
  authormd = gr.Markdown("",visible=False)
56
  tagsmd = gr.Markdown("",visible=False)
57
  abstractmd = gr.Markdown("",visible=False)
58
  datasetmd = gr.Markdown("",visible=False)
59
  gallery = gr.Gallery([],visible=False)
60
  if evt.index[1] == 5:
@@ -67,6 +85,12 @@ def showmodal(evt: gr.SelectData):
67
  tagstr = ''.join(['<span class="tag">#'+tag+'</span> ' for tag in tags])
68
  tagsmd = gr.Markdown(tagstr, visible=True)
69
 
70
  titlemd = gr.Markdown('# ['+itemdic['Link']+']('+itemdic['URL']+')',visible=True)
71
 
72
  if pd.notnull(itemdic['Authors']):
@@ -83,7 +107,7 @@ def showmodal(evt: gr.SelectData):
83
  if len(screenshots) > 0:
84
  gallery = gr.Gallery(screenshots, visible=True)
85
 
86
- return [modal, titlemd, authormd, tagsmd, abstractmd, datasetmd, gallery]
87
 
88
  with gr.Blocks(title = "Social Impact Measurement V2", css=custom_css, theme=gr.themes.Base()) as demo: #theme=gr.themes.Soft(),
89
  # create tabs for the app, moving the current table to one titled "rewardbench" and the benchmark_text to a tab called "About"
@@ -96,18 +120,18 @@ with gr.Blocks(title = "Social Impact Measurement V2", css=custom_css, theme=gr.
96
  gr.Markdown("""
97
  #### Technical Base System Evaluations:
98
 
99
- Below we list the aspects possible to evaluate in a generative system. Context-absent evaluations only provide narrow insights into the described aspects of the type of generative AI system. The depth of literature and research on evaluations differ by modality with some modalities having sparse or no relevant literature, but the themes for evaluations can be applied to most systems.
100
 
101
  The following categories are high-level, non-exhaustive, and present a synthesis of the findings across different modalities. They refer solely to what can be evaluated in a base technical system:
102
 
103
  """)
104
  with gr.Tabs(elem_classes="tab-buttons") as tabs1:
105
- with gr.TabItem("Bias/Stereotypes"):
106
  fulltable = globaldf[globaldf['Group'] == 'BiasEvals']
107
- fulltable = fulltable[['Modality','Type', 'Suggested Evaluation', 'What it is evaluating', 'Considerations', 'Link']]
108
 
109
  gr.Markdown("""
110
- Generative AI systems can perpetuate harmful biases from various sources, including systemic, human, and statistical biases. These biases, also known as "fairness" considerations, can manifest in the final system due to choices made throughout the development process. They include harmful associations and stereotypes related to protected classes, such as race, gender, and sexuality. Evaluating biases involves assessing correlations, co-occurrences, sentiment, and toxicity across different modalities, both within the model itself and in the outputs of downstream tasks.
111
  """)
112
  with gr.Row():
113
  modality_filter = gr.CheckboxGroup(["Text", "Image", "Audio", "Video"],
@@ -116,17 +140,17 @@ The following categories are high-level, non-exhaustive, and present a synthesis
116
  show_label=True,
117
  # info="Which modality to show."
118
  )
119
- type_filter = gr.CheckboxGroup(["Model", "Dataset", "Output", "Taxonomy"],
120
  value=["Model", "Dataset", "Output", "Taxonomy"],
121
- label="Type",
122
  show_label=True,
123
  # info="Which modality to show."
124
  )
125
  with gr.Row():
126
- biastable_full = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=False, interactive=False)
127
- biastable_filtered = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=True, interactive=False)
128
- modality_filter.change(filter_modality, inputs=[biastable_filtered, modality_filter], outputs=biastable_filtered)
129
- type_filter.change(filter_type, inputs=[biastable_filtered, type_filter], outputs=biastable_filtered)
130
 
131
 
132
  with Modal(visible=False) as modal:
@@ -138,25 +162,94 @@ The following categories are high-level, non-exhaustive, and present a synthesis
138
  gr.Markdown("### What it is evaluating", visible=True)
139
  gr.Markdown('## Resources', visible=True)
140
  gr.Markdown('### What you need to do this evaluation', visible=True)
141
  datasetmd = gr.Markdown(visible=False)
142
  gr.Markdown("## Results", visible=True)
143
  gr.Markdown("### Metrics", visible=True)
144
  gallery = gr.Gallery(visible=False)
145
- biastable_filtered.select(showmodal, None, [modal, titlemd, authormd, tagsmd, abstractmd, datasetmd, gallery])
146
 
147
 
148
 
149
  with gr.TabItem("Cultural Values/Sensitive Content"):
150
  with gr.Row():
151
- gr.Image()
152
 
153
  # with gr.TabItem("Disparate Performance"):
154
  # with gr.Row():
155
  # gr.Image()
156
 
157
  with gr.TabItem("Privacy/Data Protection"):
158
  with gr.Row():
159
- gr.Image()
160
 
161
  # with gr.TabItem("Financial Costs"):
162
  # with gr.Row():
 
4
  from gradio_modal import Modal
5
  import os
6
  import yaml
7
+ import itertools
8
 
9
  folder_path = 'configs'
10
  # List to store data from YAML files
11
  data_list = []
12
  metadata_dict = {}
13
 
14
+
15
+ def expand_string_list(string_list):
16
+ expanded_list = []
17
+
18
+ # Add individual strings to the expanded list
19
+ expanded_list.extend(string_list)
20
+
21
+ # Generate combinations of different lengths from the input list
22
+ for r in range(2, len(string_list) + 1):
23
+ combinations = itertools.combinations(string_list, r)
24
+ for combination in combinations:
25
+ # Generate permutations of each combination
26
+ permutations = itertools.permutations(combination)
27
+ for permutation in permutations:
28
+ expanded_list.append(' + '.join(permutation))
29
+
30
+ return expanded_list
31
+
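For context on the new helper: a minimal standalone sketch, not part of the commit, showing what expand_string_list returns. The function body mirrors the one added above; the example inputs are the app's own category lists.

```python
import itertools

def expand_string_list(string_list):
    # Singles first, then every ' + '-joined ordered combination (as added above).
    expanded_list = list(string_list)
    for r in range(2, len(string_list) + 1):
        for combination in itertools.combinations(string_list, r):
            for permutation in itertools.permutations(combination):
                expanded_list.append(' + '.join(permutation))
    return expanded_list

print(expand_string_list(["Model", "Dataset"]))
# ['Model', 'Dataset', 'Model + Dataset', 'Dataset + Model']

# For the four modalities this yields 4 + 12 + 24 + 24 = 64 labels, so combined
# values such as "Text + Image + Audio" in the configs map to known categories.
print(len(expand_string_list(["Text", "Image", "Audio", "Video"])))  # 64
```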
32
  # Iterate over each file in the folder
33
  for filename in os.listdir(folder_path):
34
  if filename.endswith('.yaml'):
 
45
 
46
  # Define the desired order of categories
47
  modality_order = ["Text", "Image", "Audio", "Video"]
48
+ level_order = ["Model", "Dataset", "Output", "Taxonomy"]
49
 
50
+ modality_order = expand_string_list(modality_order)
51
+ level_order = expand_string_list(level_order)
52
+
53
+ # Convert Modality and Level columns to categorical with specified order
54
  globaldf['Modality'] = pd.Categorical(globaldf['Modality'], categories=modality_order, ordered=True)
55
+ globaldf['Level'] = pd.Categorical(globaldf['Level'], categories=level_order, ordered=True)
56
 
57
+ # Sort DataFrame by Modality and Level
58
+ globaldf.sort_values(by=['Modality', 'Level'], inplace=True)
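To illustrate why the expanded category lists matter for sorting: a small hypothetical sketch in which the DataFrame stands in for globaldf. The trimmed categories list is an assumption standing in for the full expand_string_list output; the point is that single modalities sort first and combined labels after them.

```python
import pandas as pd

# Trimmed stand-in for expand_string_list(["Text", "Image", "Audio", "Video"]):
# the four singles, then the ' + '-joined combinations.
categories = ["Text", "Image", "Audio", "Video",
              "Text + Image", "Image + Text", "Text + Image + Audio"]

df = pd.DataFrame({"Modality": ["Image", "Text + Image + Audio", "Text", "Audio"]})
df["Modality"] = pd.Categorical(df["Modality"], categories=categories, ordered=True)

print(df.sort_values("Modality")["Modality"].tolist())
# ['Text', 'Image', 'Audio', 'Text + Image + Audio']
```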
59
 
60
  # create a gradio page with tabs and accordions
61
 
62
  # Path: taxonomy.py
63
 
64
+ def filter_modality_level(fulltable, modality_filter, level_filter):
65
+ filteredtable = fulltable[fulltable['Modality'].str.contains('|'.join(modality_filter)) & fulltable['Level'].str.contains('|'.join(level_filter))]
66
  return filteredtable
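How the combined filter behaves: a minimal sketch with a hypothetical three-row table (the real table comes from the YAML configs). It uses the same str.contains logic as the function above, so multi-modality rows match whenever any of their parts is checked.

```python
import pandas as pd

table = pd.DataFrame({
    "Modality": ["Text", "Image", "Text + Image + Audio"],
    "Level": ["Dataset", "Model", "Model"],
    "Suggested Evaluation": ["CrowS-Pairs", "iEAT", "Measuring Forgetting"],
})

def filter_modality_level(fulltable, modality_filter, level_filter):
    # A row is kept if its Modality and Level strings each contain at least
    # one of the selected values (joined into an alternation pattern).
    return fulltable[
        fulltable["Modality"].str.contains("|".join(modality_filter))
        & fulltable["Level"].str.contains("|".join(level_filter))
    ]

print(filter_modality_level(table, ["Audio"], ["Model"])["Suggested Evaluation"].tolist())
# ['Measuring Forgetting']

# Edge case: with nothing checked, '|'.join([]) is '', which matches every row,
# so an empty selection shows the full table rather than an empty one.
```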
67
 
68
  def showmodal(evt: gr.SelectData):
 
72
  authormd = gr.Markdown("",visible=False)
73
  tagsmd = gr.Markdown("",visible=False)
74
  abstractmd = gr.Markdown("",visible=False)
75
+ modelsmd = gr.Markdown("",visible=False)
76
  datasetmd = gr.Markdown("",visible=False)
77
  gallery = gr.Gallery([],visible=False)
78
  if evt.index[1] == 5:
 
85
  tagstr = ''.join(['<span class="tag">#'+tag+'</span> ' for tag in tags])
86
  tagsmd = gr.Markdown(tagstr, visible=True)
87
 
88
+ models = itemdic['Applicable Models']
89
+ if isinstance(models, list):
90
+ if len(models) > 0:
91
+ modelstr = '### Applicable Models: '+''.join(['<span class="tag">'+model+'</span> ' for model in models])
92
+ modelsmd = gr.Markdown(modelstr, visible=True)
93
+
94
  titlemd = gr.Markdown('# ['+itemdic['Link']+']('+itemdic['URL']+')',visible=True)
95
 
96
  if pd.notnull(itemdic['Authors']):
 
107
  if len(screenshots) > 0:
108
  gallery = gr.Gallery(screenshots, visible=True)
109
 
110
+ return [modal, titlemd, authormd, tagsmd, abstractmd, modelsmd, datasetmd, gallery]
111
 
112
  with gr.Blocks(title = "Social Impact Measurement V2", css=custom_css, theme=gr.themes.Base()) as demo: #theme=gr.themes.Soft(),
113
  # create tabs for the app, moving the current table to one titled "rewardbench" and the benchmark_text to a tab called "About"
 
120
  gr.Markdown("""
121
  #### Technical Base System Evaluations:
122
 
123
+ Below we list the aspects possible to evaluate in a generative system. Context-absent evaluations only provide narrow insights into the described aspects of the type of generative AI system. The depth of literature and research on evaluations differs by modality, with some modalities having sparse or no relevant literature, but the themes for evaluations can be applied to most systems.
124
 
125
  The following categories are high-level, non-exhaustive, and present a synthesis of the findings across different modalities. They refer solely to what can be evaluated in a base technical system:
126
 
127
  """)
128
  with gr.Tabs(elem_classes="tab-buttons") as tabs1:
129
+ with gr.TabItem("Bias/Stereotypes"):
130
  fulltable = globaldf[globaldf['Group'] == 'BiasEvals']
131
+ fulltable = fulltable[['Modality','Level', 'Suggested Evaluation', 'What it is evaluating', 'Considerations', 'Link']]
132
 
133
  gr.Markdown("""
134
+ Generative AI systems can perpetuate harmful biases from various sources, including systemic, human, and statistical biases. These biases, also known as "fairness" considerations, can manifest in the final system due to choices made throughout the development process. They include harmful associations and stereotypes related to protected classes, such as race, gender, and sexuality. Evaluating biases involves assessing correlations, co-occurrences, sentiment, and toxicity across different modalities, both within the model itself and in the outputs of downstream tasks.
135
  """)
136
  with gr.Row():
137
  modality_filter = gr.CheckboxGroup(["Text", "Image", "Audio", "Video"],
 
140
  show_label=True,
141
  # info="Which modality to show."
142
  )
143
+ level_filter = gr.CheckboxGroup(["Model", "Dataset", "Output", "Taxonomy"],
144
  value=["Model", "Dataset", "Output", "Taxonomy"],
145
+ label="Level",
146
  show_label=True,
147
  # info="Which modality to show."
148
  )
149
  with gr.Row():
150
+ table_full = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=False, interactive=False)
151
+ table_filtered = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=True, interactive=False)
152
+ modality_filter.change(filter_modality_level, inputs=[table_full, modality_filter, level_filter], outputs=table_filtered)
153
+ level_filter.change(filter_modality_level, inputs=[table_full, modality_filter, level_filter], outputs=table_filtered)
154
 
155
 
156
  with Modal(visible=False) as modal:
 
162
  gr.Markdown("### What it is evaluating", visible=True)
163
  gr.Markdown('## Resources', visible=True)
164
  gr.Markdown('### What you need to do this evaluation', visible=True)
165
+ modelsmd = gr.Markdown(visible=False)
166
  datasetmd = gr.Markdown(visible=False)
167
  gr.Markdown("## Results", visible=True)
168
  gr.Markdown("### Metrics", visible=True)
169
  gallery = gr.Gallery(visible=False)
170
+ table_filtered.select(showmodal, None, [modal, titlemd, authormd, tagsmd, abstractmd, modelsmd, datasetmd, gallery])
171
 
172
 
173
 
174
  with gr.TabItem("Cultural Values/Sensitive Content"):
175
+ fulltable = globaldf[globaldf['Group'] == 'CulturalEvals']
176
+ fulltable = fulltable[['Modality','Level', 'Suggested Evaluation', 'What it is evaluating', 'Considerations', 'Link']]
177
+
178
+ gr.Markdown("""Cultural values are specific to groups and sensitive content is normative. Sensitive topics also vary by culture and can include hate speech. What is considered a sensitive topic, such as egregious violence or adult sexual content, can vary widely by viewpoint. Due to norms differing by culture, region, and language, there is no standard for what constitutes sensitive content.
179
+ Distinct cultural values present a challenge for deploying models into a global sphere, as what may be appropriate in one culture may be unsafe in others. Generative AI systems cannot be neutral or objective, nor can they encompass truly universal values. There is no “view from nowhere”; in quantifying anything, a particular frame of reference is imposed.
180
+ """)
181
  with gr.Row():
182
+ modality_filter = gr.CheckboxGroup(["Text", "Image", "Audio", "Video"],
183
+ value=["Text", "Image", "Audio", "Video"],
184
+ label="Modality",
185
+ show_label=True,
186
+ # info="Which modality to show."
187
+ )
188
+ level_filter = gr.CheckboxGroup(["Model", "Dataset", "Output", "Taxonomy"],
189
+ value=["Model", "Dataset", "Output", "Taxonomy"],
190
+ label="Level",
191
+ show_label=True,
192
+ # info="Which modality to show."
193
+ )
194
+ with gr.Row():
195
+ table_full = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=False, interactive=False)
196
+ table_filtered = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=True, interactive=False)
197
+ modality_filter.change(filter_modality_level, inputs=[table_full, modality_filter, level_filter], outputs=table_filtered)
198
+ level_filter.change(filter_modality_level, inputs=[table_full, modality_filter, level_filter], outputs=table_filtered)
199
+
200
+
201
+ with Modal(visible=False) as modal:
202
+ titlemd = gr.Markdown(visible=False)
203
+ authormd = gr.Markdown(visible=False)
204
+ tagsmd = gr.Markdown(visible=False)
205
+ abstractmd = gr.Markdown(visible=False)
206
+ modelsmd = gr.Markdown(visible=False)
207
+ datasetmd = gr.Markdown(visible=False)
208
+ gallery = gr.Gallery(visible=False)
209
+ table_filtered.select(showmodal, None, [modal, titlemd, authormd, tagsmd, abstractmd, modelsmd, datasetmd, gallery])
210
+
211
+
212
 
213
  # with gr.TabItem("Disparate Performance"):
214
  # with gr.Row():
215
  # gr.Image()
216
 
217
  with gr.TabItem("Privacy/Data Protection"):
218
+ fulltable = globaldf[globaldf['Group'] == 'PrivacyEvals']
219
+ fulltable = fulltable[['Modality','Level', 'Suggested Evaluation', 'What it is evaluating', 'Considerations', 'Link']]
220
+
221
+ gr.Markdown("""Cultural values are specific to groups and sensitive content is normative. Sensitive topics also vary by culture and can include hate speech. What is considered a sensitive topic, such as egregious violence or adult sexual content, can vary widely by viewpoint. Due to norms differing by culture, region, and language, there is no standard for what constitutes sensitive content.
222
+ Distinct cultural values present a challenge for deploying models into a global sphere, as what may be appropriate in one culture may be unsafe in others. Generative AI systems cannot be neutral or objective, nor can they encompass truly universal values. There is no “view from nowhere”; in quantifying anything, a particular frame of reference is imposed.
223
+ """)
224
  with gr.Row():
225
+ modality_filter = gr.CheckboxGroup(["Text", "Image", "Audio", "Video"],
226
+ value=["Text", "Image", "Audio", "Video"],
227
+ label="Modality",
228
+ show_label=True,
229
+ # info="Which modality to show."
230
+ )
231
+ level_filter = gr.CheckboxGroup(["Model", "Dataset", "Output", "Taxonomy"],
232
+ value=["Model", "Dataset", "Output", "Taxonomy"],
233
+ label="Level",
234
+ show_label=True,
235
+ # info="Which modality to show."
236
+ )
237
+ with gr.Row():
238
+ table_full = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=False, interactive=False)
239
+ table_filtered = gr.DataFrame(value=fulltable, wrap=True, datatype="markdown", visible=True, interactive=False)
240
+ modality_filter.change(filter_modality_level, inputs=[table_full, modality_filter, level_filter], outputs=table_filtered)
241
+ level_filter.change(filter_modality_level, inputs=[table_full, modality_filter, level_filter], outputs=table_filtered)
242
+
243
+
244
+ with Modal(visible=False) as modal:
245
+ titlemd = gr.Markdown(visible=False)
246
+ authormd = gr.Markdown(visible=False)
247
+ tagsmd = gr.Markdown(visible=False)
248
+ abstractmd = gr.Markdown(visible=False)
249
+ modelsmd = gr.Markdown(visible=False)
250
+ datasetmd = gr.Markdown(visible=False)
251
+ gallery = gr.Gallery(visible=False)
252
+ table_filtered.select(showmodal, None, [modal, titlemd, authormd, tagsmd, abstractmd, modelsmd, datasetmd, gallery])
253
 
254
  # with gr.TabItem("Financial Costs"):
255
  # with gr.Row():
configs/crowspairs.yaml CHANGED
@@ -14,6 +14,6 @@ Screenshots:
14
  - Images/CrowsPairs1.png
15
  - Images/CrowsPairs2.png
16
  Suggested Evaluation: CrowS-Pairs
17
- Type: Dataset
18
  URL: https://arxiv.org/abs/2010.00133
19
  What it is evaluating: Protected class stereotypes
 
14
  - Images/CrowsPairs1.png
15
  - Images/CrowsPairs2.png
16
  Suggested Evaluation: CrowS-Pairs
17
+ Level: Dataset
18
  URL: https://arxiv.org/abs/2010.00133
19
  What it is evaluating: Protected class stereotypes
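For reference, configs like this one are what the loop at the top of app.py turns into rows of globaldf. The loading loop is truncated in the hunk above, so the sketch below is an assumed reading of the pattern (folder name and yaml.safe_load usage inferred from the shown code), not the commit's exact implementation.

```python
import os
import yaml
import pandas as pd

folder_path = "configs"
data_list = []

# Assumed pattern: each YAML config becomes one row of the evaluation table.
for filename in os.listdir(folder_path):
    if filename.endswith(".yaml"):
        with open(os.path.join(folder_path, filename)) as f:
            data_list.append(yaml.safe_load(f))

globaldf = pd.DataFrame(data_list)
print(globaldf[["Modality", "Level", "Suggested Evaluation"]].head())
```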
configs/homoglyphbias.yaml DELETED
@@ -1,16 +0,0 @@
1
- Abstract: .nan
2
- Applicable Models: .nan
3
- Authors: .nan
4
- Considerations: .nan
5
- Datasets: .nan
6
- Group: BiasEvals
7
- Hashtags: .nan
8
- Link: Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
9
- Modality: Image
10
- Screenshots: []
11
- Suggested Evaluation: Effect of different scripts on text-to-image generation
12
- Type: Output
13
- URL: https://arxiv.org/pdf/2209.08891.pdf
14
- What it is evaluating: It evaluates generated images for cultural stereotypes, when
15
- using different scripts (homoglyphs). It somewhat measures the susceptibility of
16
- a model to produce cultural stereotypes by simply switching the script
configs/honest.yaml CHANGED
@@ -11,6 +11,6 @@ Link: 'HONEST: Measuring Hurtful Sentence Completion in Language Models'
11
  Modality: Text
12
  Screenshots: []
13
  Suggested Evaluation: 'HONEST: Measuring Hurtful Sentence Completion in Language Models'
14
- Type: Output
15
  URL: https://aclanthology.org/2021.naacl-main.191.pdf
16
  What it is evaluating: Protected class stereotypes and hurtful language
 
11
  Modality: Text
12
  Screenshots: []
13
  Suggested Evaluation: 'HONEST: Measuring Hurtful Sentence Completion in Language Models'
14
+ Level: Output
15
  URL: https://aclanthology.org/2021.naacl-main.191.pdf
16
  What it is evaluating: Protected class stereotypes and hurtful language
configs/ieat.yaml CHANGED
@@ -12,6 +12,6 @@ Link: Image Representations Learned With Unsupervised Pre-Training Contain Human
12
  Modality: Image
13
  Screenshots: []
14
  Suggested Evaluation: Image Embedding Association Test (iEAT)
15
- Type: Model
16
  URL: https://dl.acm.org/doi/abs/10.1145/3442188.3445932
17
  What it is evaluating: Embedding associations
 
12
  Modality: Image
13
  Screenshots: []
14
  Suggested Evaluation: Image Embedding Association Test (iEAT)
15
+ Level: Model
16
  URL: https://dl.acm.org/doi/abs/10.1145/3442188.3445932
17
  What it is evaluating: Embedding associations
configs/imagedataleak.yaml CHANGED
@@ -10,6 +10,6 @@ Link: 'Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias i
10
  Modality: Image
11
  Screenshots: []
12
  Suggested Evaluation: Dataset leakage and model leakage
13
- Type: Dataset
14
  URL: https://arxiv.org/abs/1811.08489
15
  What it is evaluating: Gender and label bias
 
10
  Modality: Image
11
  Screenshots: []
12
  Suggested Evaluation: Dataset leakage and model leakage
13
+ Level: Dataset
14
  URL: https://arxiv.org/abs/1811.08489
15
  What it is evaluating: Gender and label bias
configs/measuringforgetting.yaml ADDED
@@ -0,0 +1,19 @@
1
+ Abstract: "Machine learning models exhibit two seemingly contradictory phenomena: training data memorization, and various forms of forgetting. In memorization, models overfit specific training examples and become susceptible to privacy attacks. In forgetting, examples which appeared early in training are forgotten by the end. In this work, we connect these phenomena. We propose a technique to measure to what extent models \"forget\" the specifics of training examples, becoming less susceptible to privacy attacks on examples they have not seen recently. We show that, while non-convex models can memorize data forever in the worst-case, standard image, speech, and language models empirically do forget examples over time. We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget. Our results suggest that examples seen early when training with extremely large datasets - for instance those examples used to pre-train a model - may observe privacy benefits at the expense of examples seen later."
2
+ Applicable Models:
3
+ - ResNet (Image)
4
+ - Conformer (Audio)
5
+ - T5 (Text)
6
+ Authors: Matthew Jagielski, Om Thakkar, Florian Tramèr, Daphne Ippolito, Katherine Lee, Nicholas Carlini, Eric Wallace, Shuang Song, Abhradeep Thakurta, Nicolas Papernot, Chiyuan Zhang
7
+ Considerations: .nan
8
+ Datasets: .nan
9
+ Group: PrivacyEvals
10
+ Hashtags: .nan
11
+ Link: 'Measuring Forgetting of Memorized Training Examples'
12
+ Modality: Text + Image + Audio
13
+ Screenshots:
14
+ - Images/Forgetting1.png
15
+ - Images/Forgetting2.png
16
+ Suggested Evaluation: Measuring forgetting of training examples
17
+ Level: Model
18
+ URL: https://arxiv.org/pdf/2207.00099.pdf
19
+ What it is evaluating: Measure whether models forget training examples over time, over different types of models (image, audio, text) and how order of training affects privacy attacks
configs/notmyvoice.yaml CHANGED
@@ -11,6 +11,6 @@ Modality: Audio
11
  Screenshots: []
12
  Suggested Evaluation: Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech
13
  Generators
14
- Type: Taxonomy
15
  URL: https://arxiv.org/pdf/2402.01708.pdf
16
  What it is evaluating: Lists harms of audio/speech generators
 
11
  Screenshots: []
12
  Suggested Evaluation: Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech
13
  Generators
14
+ Level: Taxonomy
15
  URL: https://arxiv.org/pdf/2402.01708.pdf
16
  What it is evaluating: Lists harms of audio/speech generators
configs/palms.yaml ADDED
@@ -0,0 +1,14 @@
1
+ Abstract: "Language models can generate harmful and biased outputs and exhibit undesirable behavior according to a given cultural context. We propose a Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets, an iterative process to significantly change model behavior by crafting and fine-tuning on a dataset that reflects a predetermined set of target values. We evaluate our process using three metrics: quantitative metrics with human evaluations that score output adherence to a target value, toxicity scoring on outputs; and qualitative metrics analyzing the most common word associated with a given social category. Through each iteration, we add additional training dataset examples based on observed shortcomings from evaluations. PALMS performs significantly better on all metrics compared to baseline and control models for a broad range of GPT-3 language model sizes without compromising capability integrity. We find that the effectiveness of PALMS increases with model size. We show that significantly adjusting language model behavior is feasible with a small, hand-curated dataset."
2
+ Applicable Models: .nan
3
+ Authors: Irene Solaiman, Christy Dennison
4
+ Considerations: Requires predefining what adherence to a culture means for human evals
5
+ Datasets: .nan
6
+ Group: CulturalEvals
7
+ Hashtags: .nan
8
+ Link: 'Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets'
9
+ Modality: Text
10
+ Screenshots: .nan
11
+ Suggested Evaluation: Human and Toxicity Evals of Cultural Value Categories
12
+ Level: Output
13
+ URL: http://arxiv.org/abs/2106.10328
14
+ What it is evaluating: Adherence to defined norms for a set of cultural categories
configs/safelatentdiff.yaml ADDED
@@ -0,0 +1,17 @@
1
+ Abstract: "Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications. Since they are highly data-driven, relying on billion-sized datasets randomly scraped from the internet, they also suffer, as we demonstrate, from degenerated and biased human behavior. In turn, they may even reinforce such biases. To help combat these undesired side effects, we present safe latent diffusion (SLD). Specifically, to measure the inappropriate degeneration due to unfiltered and imbalanced training sets, we establish a novel image generation test bed-inappropriate image prompts (I2P)-containing dedicated, real-world image-to-text prompts covering concepts such as nudity and violence. As our exhaustive empirical evaluation demonstrates, the introduced SLD removes and suppresses inappropriate image parts during the diffusion process, with no additional training required and no adverse effect on overall image quality or text alignment."
2
+ Applicable Models:
3
+ - Stable Diffusion
4
+ Authors: Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting
5
+ Considerations: What is considered appropriate and inappropriate varies strongly across cultures and is very context dependent
6
+ Datasets: https://huggingface.co/datasets/AIML-TUDA/i2p
7
+ Group: CulturalEvals
8
+ Hashtags: .nan
9
+ Link: 'Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models'
10
+ Modality: Image
11
+ Screenshots:
12
+ - Images/SLD1.png
13
+ - Images/SLD2.png
14
+ Suggested Evaluation: Evaluating text-to-image models for safety
15
+ Level: Output
16
+ URL: https://arxiv.org/pdf/2211.05105.pdf
17
+ What it is evaluating: Generating images for a diverse set of prompts (novel I2P benchmark) and investigating how often e.g. violent/nude images will be generated. There is a distinction between implicit and explicit safety, i.e. unsafe results with “normal” prompts.
configs/stablebias.yaml CHANGED
@@ -9,6 +9,6 @@ Link: 'Stable bias: Analyzing societal representations in diffusion models'
9
  Modality: Image
10
  Screenshots: []
11
  Suggested Evaluation: Characterizing the variation in generated images
12
- Type: Output
13
  URL: https://arxiv.org/abs/2303.11408
14
  What it is evaluating: .nan
 
9
  Modality: Image
10
  Screenshots: []
11
  Suggested Evaluation: Characterizing the variation in generated images
12
+ Level: Output
13
  URL: https://arxiv.org/abs/2303.11408
14
  What it is evaluating: .nan
configs/stereoset.yaml DELETED
@@ -1,16 +0,0 @@
1
- Abstract: .nan
2
- Applicable Models: .nan
3
- Authors: .nan
4
- Considerations: Automating stereotype detection makes distinguishing harmful stereotypes
5
- difficult. It also raises many false positives and can flag relatively neutral associations
6
- based in fact (e.g. population x has a high proportion of lactose intolerant people).
7
- Datasets: .nan
8
- Group: BiasEvals
9
- Hashtags: .nan
10
- Link: 'StereoSet: Measuring stereotypical bias in pretrained language models'
11
- Modality: Text
12
- Screenshots: []
13
- Suggested Evaluation: StereoSet
14
- Type: Dataset
15
- URL: https://arxiv.org/abs/2004.09456
16
- What it is evaluating: Protected class stereotypes
configs/tango.yaml ADDED
@@ -0,0 +1,19 @@
1
+ Abstract: "Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature."
2
+ Applicable Models:
3
+ - GPT-2
4
+ - GPT-Neo
5
+ - OPT
6
+ Authors: Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
7
+ Considerations: Based on automatic evaluations of the resulting open language generation; may be sensitive to the choice of evaluator. Would advise using a combination of Perspective, Detoxify, and regard metrics
8
+ Datasets: https://huggingface.co/datasets/AlexaAI/TANGO
9
+ Group: CulturalEvals
10
+ Hashtags: .nan
11
+ Link: '“I’m fully who I am”: Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation'
12
+ Modality: Text
13
+ Screenshots:
14
+ - Images/TANGO1.png
15
+ - Images/TANGO2.png
16
+ Suggested Evaluation: Human and Toxicity Evals of Cultural Value Categories
17
+ Level: Output
18
+ URL: http://arxiv.org/abs/2106.10328
19
+ What it is evaluating: Bias measurement for the trans and non-binary community via measuring gender non-affirmative language, specifically 1) misgendering and 2) negative responses to gender disclosure
configs/videodiversemisinfo.yaml CHANGED
@@ -13,7 +13,7 @@ Modality: Video
13
  Screenshots: []
14
  Suggested Evaluation: 'Diverse Misinformation: Impacts of Human Biases on Detection
15
  of Deepfakes on Networks'
16
- Type: Output
17
  URL: https://arxiv.org/abs/2210.10026
18
  What it is evaluating: Human led evaluations of deepfakes to understand susceptibility
19
  and representational harms (including political violence)
 
13
  Screenshots: []
14
  Suggested Evaluation: 'Diverse Misinformation: Impacts of Human Biases on Detection
15
  of Deepfakes on Networks'
16
+ Level: Output
17
  URL: https://arxiv.org/abs/2210.10026
18
  What it is evaluating: Human led evaluations of deepfakes to understand susceptibility
19
  and representational harms (including political violence)
configs/weat.yaml CHANGED
@@ -36,7 +36,7 @@ Screenshots:
36
  - Images/WEAT1.png
37
  - Images/WEAT2.png
38
  Suggested Evaluation: Word Embedding Association Test (WEAT)
39
- Type: Model
40
  URL: https://researchportal.bath.ac.uk/en/publications/semantics-derived-automatically-from-language-corpora-necessarily
41
  What it is evaluating: Associations and word embeddings based on Implicit Associations
42
  Test (IAT)
 
36
  - Images/WEAT1.png
37
  - Images/WEAT2.png
38
  Suggested Evaluation: Word Embedding Association Test (WEAT)
39
+ Level: Model
40
  URL: https://researchportal.bath.ac.uk/en/publications/semantics-derived-automatically-from-language-corpora-necessarily
41
  What it is evaluating: Associations and word embeddings based on Implicit Associations
42
  Test (IAT)