The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Paper • 2303.03915 • Published Mar 7, 2023 • 6
SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval) Paper • 2304.06845 • Published Apr 13, 2023
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages Paper • 2305.06897 • Published May 11, 2023 • 8
MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages Paper • 2305.13989 • Published May 23, 2023
AfriMTE and AfriCOMET: Empowering COMET to Embrace Under-resourced African Languages Paper • 2311.09828 • Published Nov 16, 2023 • 1
The Effect of Domain and Diacritics in Yorùbá-English Neural Machine Translation Paper • 2103.08647 • Published Mar 15, 2021
MasakhaNER: Named Entity Recognition for African Languages Paper • 2103.11811 • Published Mar 22, 2021
NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis Paper • 2201.08277 • Published Jan 20, 2022
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects Paper • 2309.07445 • Published Sep 14, 2023
Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning Paper • 2204.06487 • Published Apr 13, 2022
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation Paper • 2205.02022 • Published May 4, 2022
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus Paper • 2207.03546 • Published Jul 7, 2022
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition Paper • 2210.12391 • Published Oct 22, 2022
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting Paper • 2212.09535 • Published Dec 19, 2022 • 1
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark Paper • 2406.05967 • Published Jun 10, 2024 • 5
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources Paper • 2406.16746 • Published Jun 24, 2024
AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages Paper • 2302.08956 • Published Feb 17, 2023
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published Oct 16, 2024 • 29