OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 47
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking Paper • 2106.06052 • Published May 21, 2021
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 47
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language Paper • 2306.16410 • Published Jun 28, 2023 • 28
What Language Model to Train if You Have One Million GPU Hours? Paper • 2210.15424 • Published Oct 27, 2022 • 2
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants Paper • 2112.09062 • Published Dec 16, 2021
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks Paper • 2204.01906 • Published Apr 5, 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality Paper • 2204.03162 • Published Apr 7, 2022 • 1
Investigating Multi-source Active Learning for Natural Language Inference Paper • 2302.06976 • Published Feb 14, 2023
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks Paper • 2005.11401 • Published May 22, 2020 • 11
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Paper • 2303.03915 • Published Mar 7, 2023 • 6
Crosslingual Generalization through Multitask Finetuning Paper • 2211.01786 • Published Nov 3, 2022 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 27
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 27