Massively Multilingual Speech (MMS) - 300m
Facebook's MMS checkpoint with 300 million parameters.
MMS ("Massively Multilingual Speech") is Facebook AI's massively multilingual pretrained model for speech. It is pretrained with Wav2Vec2's self-supervised training objective on about 500,000 hours of speech data in over 1,400 languages.
When using the model, make sure that your speech input is sampled at 16 kHz.
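The sampling-rate requirement can be checked and, if needed, met before audio reaches the model. The snippet below is a minimal sketch using only the Python standard library. The helper name `resample_linear` and the constant `TARGET_RATE` are hypothetical and not part of MMS or Transformers. It uses plain linear interpolation with no anti-aliasing filter, so it only illustrates the idea; for real data, a dedicated audio library's resampler is likely the better choice.

```python
import math

TARGET_RATE = 16_000  # sampling rate (Hz) the model card asks for


def resample_linear(samples, source_rate, target_rate=TARGET_RATE):
    """Resample a mono waveform by linear interpolation.

    Illustrative only: no low-pass filtering is applied, so
    downsampling with this function can introduce aliasing.
    """
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sampling rates must be positive")
    # Nothing to do if the audio is already at the target rate.
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)

    duration = len(samples) / source_rate          # length in seconds
    n_out = max(1, round(duration * target_rate))  # output sample count
    step = source_rate / target_rate               # input samples per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = min(i * step, last)   # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# Usage: one second of a 440 Hz tone recorded at 48 kHz, brought to 16 kHz.
tone_48k = [math.sin(2 * math.pi * 440 * n / 48_000) for n in range(48_000)]
tone_16k = resample_linear(tone_48k, source_rate=48_000)
print(len(tone_16k))
```

Audio that is already at 16 kHz is returned unchanged (as a copy), so the helper can be applied unconditionally in a preprocessing step.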
Note: This model should be fine-tuned on a downstream task, such as Automatic Speech Recognition, Translation, or Classification. See the How to finetune section or this blog for more information about ASR.
How to finetune
Coming soon...
Model details
Developed by: Vineel Pratap et al.
Model type: Multi-Lingual Automatic Speech Recognition model
Language(s): 1000+ languages
License: CC-BY-NC 4.0
Num parameters: 300 million
Cite as:
@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
  journal={arXiv},
  year={2023}
}
Additional Links
- Blog post
- Transformers documentation
- Paper
- GitHub Repository
- Other MMS checkpoints
- MMS ASR fine-tuned checkpoints:
- Official Space