pretraining dataset is Libri-Light, not LibriSpeech

#2
by gaunernst - opened

According to Table 2 of the paper (https://arxiv.org/pdf/2202.03555), data2vec audio large was pre-trained on Libri-Light.
The GitHub page (https://github.com/facebookresearch/fairseq/tree/main/examples/data2vec) also shows that the large variants were pre-trained on Libri-Light.

The datasets tag should be updated accordingly.
