---
tags:
- espnet
- audio
- automatic-speech-recognition
language: et
license: cc-by-4.0
---

# Estonian ESPnet2 ASR model

## Model description

This is a general-purpose Estonian ASR model trained in the Lab of Language Technology at TalTech.

## Intended uses & limitations

This model is intended for general-purpose speech recognition, such as broadcast conversations, interviews, and talks.

## How to use

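```python
import soundfile
from espnet2.bin.asr_inference import Speech2Text

# Load the pretrained model by its model ID.
model = Speech2Text.from_pretrained(
    "TalTechNLP/espnet2_estonian"
)

# Read the waveform; `rate` is the sampling rate of the file.
speech, rate = soundfile.read("speech.wav")
# The model returns an n-best list; unpack the text of the top hypothesis.
text, *_ = model(speech)[0]
print(text)
```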
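
Before passing a file to the model, it can help to confirm its sampling rate and channel count. The sketch below uses only the standard-library `wave` module, so it handles PCM WAV files only. The helper names `wav_info` and `check_wav` are hypothetical and not part of ESPnet. This card does not state the sampling rate the model expects, so the default `expected_rate=16000` is an assumption based on a common ASR setup; adjust it if the model configuration says otherwise.

```python
import wave


def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def check_wav(path, expected_rate=16000):
    """Return a list of problems found; the list is empty if none are found.

    expected_rate=16000 is an assumption, not a value stated by the model card.
    """
    rate, channels, _ = wav_info(path)
    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != 1:
        problems.append(f"{channels} channels found, expected mono")
    return problems
```

For example, `check_wav("speech.wav")` returns an empty list when the file is mono at the expected rate, and otherwise lists what would need resampling or downmixing first.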

#### Limitations and bias

## Training data

## Training procedure

## Evaluation results
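
Word error rate (WER) is the usual metric for scoring ASR output. The following is a minimal sketch of how it could be computed for transcripts produced by this model on your own test data. It assumes whitespace tokenization and no text normalization (casing and punctuation are compared as-is). The function name `word_error_rate` is hypothetical and not part of ESPnet.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

With a reference of four words and a hypothesis that substitutes one word and drops another, such as `word_error_rate("a b c d", "a x c")`, the two edits give a WER of 0.5.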

### BibTeX entry and citation info

#### Citing ESPnet

```BibTex
@inproceedings{watanabe2018espnet,
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson {Enrique Yalta Soplin} and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  title={{ESPnet}: End-to-End Speech Processing Toolkit},
  year={2018},
  booktitle={Proceedings of Interspeech},
  pages={2207--2211},
  doi={10.21437/Interspeech.2018-1456},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
}
@inproceedings{hayashi2020espnet,
  title={{Espnet-TTS}: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit},
  author={Hayashi, Tomoki and Yamamoto, Ryuichi and Inoue, Katsuki and Yoshimura, Takenori and Watanabe, Shinji and Toda, Tomoki and Takeda, Kazuya and Zhang, Yu and Tan, Xu},
  booktitle={Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={7654--7658},
  year={2020},
  organization={IEEE}
}
```