espnet/WavLabLM-MS-40k

self-supervised-learning

speech-recognition


wanchichen committed on Oct 2, 2023

Commit 6b2f84f · 1 Parent(s): bcbd17d

Update README.md

Files changed (1)

README.md +5 -1

README.md CHANGED

@@ -14,7 +14,11 @@ license: cc-by-4.0
 ## WavLabLM-MS 40k
-This model was trained by [William Chen] using ESPNet2's SSL recipe in [espnet](https://github.com/espnet/espnet/).

+[Paper](https://arxiv.org/abs/2309.15317)
+This model was trained by [William Chen](https://wanchichen.github.io/) using ESPNet2's SSL recipe in [espnet](https://github.com/espnet/espnet/).
+WavLabLM is a self-supervised audio encoder pre-trained on 40,000 hours of multilingual data across 136 languages. This specific variant, WavLabLM-MS, went through a second stage of pre-training on a balanced subset of the data to improve performance on lower-resource languages.
+It achieves comparable performance to XLS-R 128 on the [ML-SUPERB Benchmark](https://arxiv.org/abs/2305.10615) with only 10% of the pre-training data.
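
The "10% of the pre-training data" figure in the added README text can be sanity-checked with a little arithmetic. The sketch below is not part of the commit. The 40,000-hour figure comes from the README itself, while the 436,000-hour figure for XLS-R is an assumption taken from the XLS-R paper, and `data_fraction` is a hypothetical helper named only for this illustration.

```python
# Hours of pre-training audio.
# WAVLABLM_HOURS is stated in the README text added by this commit.
# XLSR_HOURS is an assumption (the figure reported in the XLS-R paper),
# not something this commit states.
WAVLABLM_HOURS = 40_000
XLSR_HOURS = 436_000  # assumed


def data_fraction(model_hours: float, baseline_hours: float) -> float:
    """Return model_hours as a fraction of baseline_hours (hypothetical helper)."""
    if baseline_hours <= 0:
        raise ValueError("baseline_hours must be positive")
    return model_hours / baseline_hours


if __name__ == "__main__":
    fraction = data_fraction(WAVLABLM_HOURS, XLSR_HOURS)
    print(f"WavLabLM uses about {fraction:.1%} of the XLS-R pre-training hours")
```

Under that assumption the ratio comes out slightly above 9%, which is consistent with the README's rounded "10%".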