import streamlit as st
title = "Hate Speech in ACM"
description = "The history and development of hate speech detection as a modeling task"
date = "2022-01-26"
thumbnail = "images/prohibited.png"
__ACM_SECTION = """
Content moderation is a collection of interventions used by online platforms to partially obscure
or remove entirely from user-facing view content that is objectionable based on the company's values
or community guidelines, which vary from platform to platform.
[Sarah T. Roberts (2014)](https://yalebooks.yale.edu/book/9780300261479/behind-the-screen/) describes
content moderation as "the organized practice of screening user-generated content (UGC)
posted to Internet sites, social media, and other online outlets" (p. 12).
[Tarleton Gillespie (2021)](https://yalebooks.yale.edu/book/9780300261431/custodians-internet/) writes
that platforms moderate content "both to protect one user from another,
or one group from its antagonists, and to remove the offensive, vile, or illegal."
While there are a variety of approaches to this problem, in this tool, we focus on automated content moderation,
which is the application of algorithms to the classification of problematic content.
Content that is subject to moderation can be user-directed (e.g. targeted harassment of a particular user
in comments or direct messages) or posted to a personal account (e.g. user-created posts that contain hateful
remarks against a particular social group).
"""
__CURRENT_APPROACHES = """
Automated content moderation has relied both on analysis of the media itself (e.g. using methods from natural
language processing and computer vision) and on user dynamics (e.g. whether the user sending the content
to another user shares followers with the recipient, or whether the account posting the content is relatively new).
Often, the ACM pipeline is fed by user-reported content. Within the realm of text-based ACM, approaches range
from simple wordlist matching to data-driven machine learning models, as sketched below. Common datasets used for
training and evaluating hate speech detectors can be found at [https://hatespeechdata.com/](https://hatespeechdata.com/).
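As a minimal illustration of the difference (a sketch, not a recommendation of either approach), the snippet
below contrasts a hand-maintained wordlist lookup with a classifier trained on labeled examples; the placeholder
blocklist and two-example corpus stand in for real resources such as the datasets linked above:
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Wordlist-based approach: flag a text if any blocklisted token appears.
BLOCKLIST = {"<slur>", "<insult>"}  # placeholder terms

def wordlist_flag(text: str) -> bool:
    return any(token in BLOCKLIST for token in text.lower().split())

# Data-driven approach: learn a decision boundary from labeled examples.
# A real system would train on a hate speech corpus (see the link above);
# this two-example "dataset" is only a placeholder.
texts = ["have a wonderful day", "all <slur> should be banned from here"]
labels = [0, 1]  # 0 = acceptable, 1 = hateful
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

def model_flag(text: str) -> bool:
    return bool(model.predict([text])[0])
```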
"""
__CURRENT_CHALLENGES = """
Combating hateful content on the Internet continues to be a challenge. A 2021 survey of respondents
in the United States, conducted by the Anti-Defamation League, found an increase in online hate and harassment
directed at LGBTQ+, Asian American, Jewish, and African American individuals.
### Technical challenges for data-driven systems
With respect to models that are based on training data, datasets encode worldviews, so a common challenge
lies in having insufficient data, or data that reflects only a limited worldview. For example, a recent
study found that Tweets posted by drag queens were more often rated as toxic by an automated system than
Tweets posted by white supremacists.
This may be due, in part, to the labeling schemes and choices made for the data used to train the model,
as well as the particular company policies invoked when making those labeling choices: when reclaimed slurs
and mock impoliteness are labeled as toxic regardless of speaker and context, the resulting model learns to
flag in-group speech alongside genuine attacks.
### Context matters for content moderation
*Counterspeech* is "any direct response to hateful or harmful speech which seeks to undermine it"
(from [Dangerous Speech Project](https://dangerousspeech.org/counterspeech/)). Counterspeech has been shown
to be an important community self-moderation tool for reducing instances of hate speech (see
[Hangartner et al. 2021](https://www.pnas.org/doi/10.1073/pnas.2116310118)), but counterspeech is often
incorrectly categorized as hate speech by automatic systems due to the counterspeech making direct reference
to or quoting the original hate speech. Such system behavior silences those who are trying to push back against
hateful and toxic speech, and, if the flagged content is hidden automatically, prevents others from seeing the
counterspeech.
See [van Aken et al. 2018](https://aclanthology.org/W18-5105.pdf) for a detailed list of examples that
automatic systems frequently misclassify.
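As a toy illustration of this failure mode, the sketch below uses a deliberately naive scoring function as a
stand-in for a trained toxicity model; real systems are more sophisticated, but any scorer keyed to surface
features can fail in the same way:
```python
# Naive stand-in for a trained toxicity model that keys on surface
# features (here, the mere presence of a slur).
SLURS = {"<slur>"}  # placeholder term

def toxicity_score(text: str) -> float:
    tokens = [t.strip('"().,!?') for t in text.lower().split()]
    return sum(t in SLURS for t in tokens) / max(len(tokens), 1)

hateful = "all <slur> should leave this site"
counterspeech = 'Calling people "<slur>" is hateful. Please stop.'

# Both texts contain the slur, so a surface-level scorer rates both as
# toxic, silencing the person pushing back.
print(toxicity_score(hateful))        # > 0
print(toxicity_score(counterspeech))  # also > 0
```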
"""
__SELF_EXAMPLES = """
- [**(FB)(TOU)** - *Facebook Community Standards*](https://transparency.fb.com/policies/community-standards/)
- [**(FB)(Blog)** - *What is Hate Speech? (2017)*](https://about.fb.com/news/2017/06/hard-questions-hate-speech/)
- [**(NYT)(Blog)** - *New York Times on their partnership with Jigsaw*](https://open.nytimes.com/to-apply-machine-learning-responsibly-we-use-it-in-moderation-d001f49e0644)
- [**(NYT)(FAQ)** - *New York Times on their moderation policy*](https://help.nytimes.com/hc/en-us/articles/115014792387-Comments)
- [**(Reddit)(TOU)** - *Reddit General Content Policies*](https://www.redditinc.com/policies/content-policy)
- [**(Reddit)(Blog)** - *AutoMod - help scale moderation without ML*](https://mods.reddithelp.com/hc/en-us/articles/360008425592-Moderation-Tools-overview)
- [**(Google)(Blog)** - *Google Search Results Moderation*](https://blog.google/products/search/when-and-why-we-remove-content-google-search-results/)
- [**(Google)(Blog)** - *Jigsaw Case Studies*](https://www.perspectiveapi.com/case-studies/)
- [**(YouTube)(TOU)** - *YouTube Community Guidelines*](https://www.youtube.com/howyoutubeworks/policies/community-guidelines/)
"""
__CRITIC_EXAMPLES = """
- [Social Media and Extremism - Questions about January 6th 2021](https://thehill.com/policy/technology/589651-jan-6-panel-subpoenas-facebook-twitter-reddit-and-alphabet/)
- [Over-Moderation of LGBTQ content on YouTube](https://www.gaystarnews.com/article/youtube-lgbti-content/)
- [Disparate Impacts of Moderation](https://www.aclu.org/news/free-speech/time-and-again-social-media-giants-get-content-moderation-wrong-silencing-speech-about-al-aqsa-mosque-is-just-the-latest-example/)
- [Calls for Transparency](https://santaclaraprinciples.org/)
- [Income Loss from Failures of Moderation](https://foundation.mozilla.org/de/blog/facebook-delivers-a-serious-blow-to-tunisias-music-scene/)
- [Fighting Hate Speech, Silencing Drag Queens?](https://link.springer.com/article/10.1007/s12119-020-09790-w)
- [Reddit Self-Reflection on Lack of Content Policy](https://www.reddit.com/r/announcements/comments/gxas21/upcoming_changes_to_our_content_policy_our_board/)
"""
def run_article():
    st.markdown("## Automatic Content Moderation (ACM)")
    with st.expander("ACM definition", expanded=False):
        st.markdown(__ACM_SECTION, unsafe_allow_html=True)
    st.markdown("## Current approaches to ACM")
    with st.expander("Current Approaches"):
        st.markdown(__CURRENT_APPROACHES, unsafe_allow_html=True)
    st.markdown("## Current challenges in ACM")
    with st.expander("Current Challenges"):
        st.markdown(__CURRENT_CHALLENGES, unsafe_allow_html=True)
    st.markdown("## Examples of ACM in Use: in the Press and in their own Words")
    # Side-by-side columns: platform self-descriptions and critical coverage.
    col1, col2 = st.columns([4, 5])
    with col1.expander("In their own Words"):
        st.markdown(__SELF_EXAMPLES, unsafe_allow_html=True)
    with col2.expander("Critical Writings"):
        st.markdown(__CRITIC_EXAMPLES, unsafe_allow_html=True)