arxiv:2306.15162
YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus
Published on Jun 27, 2023
Abstract
Machine learning for sign languages is bottlenecked by data. In this paper, we present YouTube-ASL, a large-scale, open-domain corpus of American Sign Language (ASL) videos and accompanying English captions drawn from YouTube. With ~1000 hours of videos and >2500 unique signers, YouTube-ASL is ~3x as large and has ~10x as many unique signers as the largest prior ASL dataset. We train baseline models for ASL-to-English translation on YouTube-ASL and evaluate them on How2Sign, where we achieve a new finetuned state of the art of 12.39 BLEU and, for the first time, report zero-shot results.