from transformers import pipeline
import streamlit as st
import pandas as pd
from PIL import Image
import os
# title
st.title('Combatting Climate Change Misinformation with Transformers')
st.markdown("## The Gist")
st.markdown("**Problem**🤔: Climate change misinformation spreads quickly and is difficult to combat. However, it's important to do so, because climate change misinformation directly affects public opinion and public policy on climate change.")
st.markdown("**Solution**💡: Develop a pipeline in which users can input climate change claims, and the pipeline returns whether the claim is refuted or supported by current climate science, along with the corresponding evidence.")
st.markdown("**Approach**🔑")
st.markdown("* There are many steps to this pipeline. Here, I focus on fine-tuning a transformer model, ClimateBERT, using the textual entailment task.")
st.markdown("* The dataset used is Climate FEVER, a natural language inference dataset with 1,535 {claim, [evidence], [label]} tuples.")
st.markdown("* Given a {claim, evidence} pair, determine whether the climate claim is supported or refuted (or neither) by the evidence")
st.markdown("---")
st.markdown("## The Details")
# section 1: the context, problem; how to address
st.markdown("### Problem 🤔")
st.markdown("Misinformation about climate change spreads quickly and has direct impacts on public opinion and public policy surrounding the climate. Further, misinformation is difficult to combat, and people are able to \"verify\" false climate claims on biased sites. Ideally, people would be able to easily verify climate claims. This is where transformers come in.")
# section 2: what is misinformation? how is it combatted now? how successful is this?
st.markdown("### More about Misinformation")
st.markdown("What is misinformation? How does it spread?")
st.markdown("* **Misinformation** can be defined as “false or inaccurate information, especially that which is deliberately intended to deceive.”")
st.markdown("* It can exist in different domains, and each domain has different creators and distributors of misinformation.")
st.markdown("* Misinformation regarding climate change is often funded by conservative foundations or large energy industries such as gas, coal, and oil. (1)")
misinfo_flowchart = Image.open('images/misinfo_chart.jpeg')
st.image(misinfo_flowchart, caption='The misinformation flowchart. (1)')
st.markdown("**Why does this matter?** Through echo chambers, polarization, and feedback loops, misinformation can spread from these large organizations to the public, arming the public with persuasive information designed to create scepticism around and/or denial of climate change, its urgency, and climate scientists. This is especially problematic in democratic societies, where the public, to some extent, influences governmental policy decisions (2). Existing research suggests that misinformation directly contributes to public support for political inaction and for actively stalling or rejecting policies that address climate change (1).")
st.markdown("How is climate change misinformation combatted now? Below are a few approaches, according to the Brookings Institution (2):")
st.markdown("1. Asking news sources to call out misinformation")
st.markdown("2. Teaching and encouraging media literacy among the public (how to detect fake news, how to critically evaluate information, etc.)")
st.markdown("3. Governments encouraging independent journalism while avoiding censorship of the news")
st.markdown("4. Social media platform investment in algorithmic detection of fake news")
st.markdown("However, many of the proposed solutions above require people and organizations to adopt new behaviors. This is difficult to achieve, particularly among news organizations and social media platforms, which profit from misinformation through ad revenue from site usage and viewership.")
# section 3: how can transformers help?
st.markdown("### How can Transformers Help?💡")
st.markdown("**FEVER**")
st.markdown("* FEVER, or Fact Extraction and VERification, was introduced in 2018 as the first dataset containing {fact, evidence, entailment_label} information. The authors altered sentences extracted from Wikipedia and had annotators report the relationship between the sentences: entailment, contradiction, or not enough information.")
st.markdown("* Since then, other researchers have expanded on this area in different domains")
st.markdown("* Here, we use Climate FEVER (3), a similar dataset built for the climate domain.")
st.markdown("**Fact Verification / Fact-Checking**")
st.markdown("* This is simply an extension of the textual entailment task")
st.markdown("* Given two sentences, sent1 and sent2, determine the relationship: entail, contradict, neutral")
st.markdown("* With fact verification, we can think of the sentences as claim and evidence and labels as support, refute, or not enough information to refute or support.")
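# A minimal sketch (not part of the original app) of the mapping described above:
# textual-entailment labels translate directly into fact-verification verdicts.
# NLI_TO_VERDICT and to_verdict are hypothetical names chosen for illustration.
NLI_TO_VERDICT = {
    "entail": "supports",
    "contradict": "refutes",
    "neutral": "not enough information",
}


def to_verdict(nli_label):
    """Translate an entailment label into a fact-checking verdict."""
    key = nli_label.strip().lower()
    if key not in NLI_TO_VERDICT:
        raise ValueError(f"Unknown entailment label: {nli_label!r}")
    return NLI_TO_VERDICT[key]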
# section 4: The process
# this is the pipeline in my notes (u are here highlight)
st.markdown("### The Process 🔑")
st.markdown("Imagine: A person is curious about whether a claim they heard about climate change is true. How can transformers help validate or refute the claim?")
st.markdown("1. User inputs a climate claim")
# The original backslash continuation joined the sub-bullet onto the same line as
# the numbered item; explicit newlines render it as a nested list instead.
st.markdown(
    "2. Retrieve evidence related to the input claim\n"
    "    - For each claim, collect the N documents with the highest similarity scores to the claim.\n"
    "    - Current area of research: How do we keep the set of curated documents up to date and validate their contents?"
)
st.markdown("3. Send (claim, evidence) pairs to a transformer model. Have the model predict whether each piece of evidence supports, refutes, or is not relevant to the claim. (📍 YOU ARE HERE!)")
st.markdown("4. Report back to the user: The supporting evidence for the claim (if any), the refuting evidence for the claim (if any). If no relevant evidence is found, report that the claim cannot be supported or refuted by current evidence.")
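# A rough sketch of steps 2 and 4 above, assuming a simple bag-of-words cosine
# similarity. This file does not say which similarity measure or retriever is
# used, so top_n_documents and summarize_verdicts are hypothetical helpers for
# illustration only, not the author's implementation.
import math
import re
from collections import Counter


def _bag_of_words(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_n_documents(claim, documents, n=5):
    """Step 2: return the n documents most similar to the claim, best first."""
    claim_vec = _bag_of_words(claim)
    ranked = sorted(documents, key=lambda doc: _cosine(claim_vec, _bag_of_words(doc)), reverse=True)
    return ranked[:n]


def summarize_verdicts(pairs):
    """Step 4: group (evidence, verdict) pairs into supporting and refuting evidence.

    Evidence with any other verdict is treated as not relevant and left out.
    """
    report = {"supports": [], "refutes": []}
    for evidence, verdict in pairs:
        if verdict in report:
            report[verdict].append(evidence)
    return report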
# section 5: my work
st.markdown("### Climate Claim Fact-Checking with Transformers")
st.markdown("My work focuses on step 3 of the process: Training a transformer model to accurately categorize (claim, evidence) as:")
st.markdown("* evidence *supports* (entails) claim")
st.markdown("* evidence *refutes* (contradicts) claim")
st.markdown("* evidence *does not provide enough info to support or refute* (neutral) claim")
st.markdown("For this project, I fine-tune ClimateBERT (4) on the textual entailment task.")
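# A hedged sketch of step 3: scoring one (claim, evidence) pair with a
# text-classification model. The fine-tuned checkpoint is not named in this
# file, so the classifier is supplied by the caller, for example one built with
# transformers.pipeline("text-classification") and the author's own model.
# classify_pair is a hypothetical helper, not the author's code, and the
# claim-first input order is an assumption: it must match the order used
# during fine-tuning.
def classify_pair(classifier, claim, evidence):
    """Return (label, score) for a claim/evidence pair.

    `classifier` is any callable that accepts {"text": ..., "text_pair": ...},
    the sentence-pair input form of Hugging Face text-classification pipelines,
    and returns a dict (or a one-element list of dicts) with "label" and
    "score" keys.
    """
    result = classifier({"text": claim, "text_pair": evidence})
    if isinstance(result, list):  # some versions wrap the single result in a list
        result = result[0]
    return result["label"], float(result["score"])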
# section 6: analysis
# section 7: conclusion
# References + Resource Links
st.markdown("### Resource Links")
# climatefever paper
# feverpaper
# fact checking covid paper
# models
# nli fine-tuning notebook
st.markdown("### References")
st.markdown("1. https://www.carbonbrief.org/guest-post-how-climate-change-misinformation-spreads-online")
st.markdown("2. https://www.brookings.edu/research/how-to-combat-fake-news-and-disinformation/")
st.markdown("3. Climate FEVER [paper](https://arxiv.org/abs/2012.00614), [huggingface repo](https://huggingface.co/datasets/climate_fever), and [github](https://github.com/huggingface/datasets/tree/master/datasets/climate_fever)")
st.markdown("4. [ClimateBERT](https://climatebert.ai/), [paper](https://arxiv.org/abs/2110.12010)")