bene-ges/tts_ru_ipa_fastpitch_ruslan


bene-ges committed on Apr 18, 2023

Commit 7784ed2 • 1 Parent(s): 9b02d29

Update README.md

Files changed (1)

README.md +25 -0


@@ -1,3 +1,28 @@
 ---
 license: cc-by-4.0
+language:
+- ru
+library_name: nemo
 ---
+### Input
+This model expects input text that has already been converted to an IPA-like phonemic transcription. Use this [g2p model](https://huggingface.co/bene-ges/ru_g2p_ipa_bert_large) to convert plain text to phonemes.
+Plain text is also accepted, but the output quality will be low.
+### Output
+This model generates mel spectrograms.
+## Training
+The model was trained with the NeMo toolkit [1] for more than 1000 epochs.
+### Datasets
+This model was trained on the [RUSLAN](https://ruslan-corpus.github.io/) [2] corpus, sampled at 22050 Hz.
+## References
+- [1] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
+- [2] Gabdrakhmanov L., Garaev R., Razinkov E. (2019) RUSLAN: Russian Spoken Language Corpus for Speech Synthesis. In: Salah A., Karpov A., Potapova R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham.
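
The Input and Output sections added in this diff imply a two-stage pipeline: plain text is first converted to an IPA-like transcription, and the transcription is then passed to the FastPitch model, which returns a mel spectrogram. The following is a minimal sketch of that flow, not the author's documented usage. The helper names `prepare_input` and `synthesize_mel`, the `g2p` callable, and the checkpoint path are all hypothetical; the README does not name the checkpoint file or describe how the g2p model is invoked. The sketch assumes NeMo's `FastPitchModel` with its `restore_from`, `parse`, and `generate_spectrogram` methods.

```python
from typing import Callable, Optional


def prepare_input(text: str, g2p: Optional[Callable[[str], str]] = None) -> str:
    """Return the string to feed to the TTS model.

    `g2p` is a hypothetical callable that maps plain text to an IPA-like
    transcription. If it is omitted, the plain text is passed through
    unchanged, which the README says works but gives low quality.
    """
    return g2p(text) if g2p is not None else text


def synthesize_mel(ipa_text: str, checkpoint_path: str):
    """Generate a mel spectrogram from an IPA-like transcription.

    `checkpoint_path` is an assumption: a local .nemo file for this model.
    """
    # Imported lazily so this module can be loaded without NeMo installed.
    import torch
    from nemo.collections.tts.models import FastPitchModel

    model = FastPitchModel.restore_from(checkpoint_path)
    model.eval()
    with torch.no_grad():
        # normalize=False is an assumption: the input is already phonemic,
        # so plain-text normalization is skipped.
        tokens = model.parse(ipa_text, normalize=False)
        return model.generate_spectrogram(tokens=tokens)


# Example (requires NeMo and a downloaded checkpoint; path is hypothetical):
# mel = synthesize_mel(prepare_input("привет", g2p=my_g2p), "model.nemo")
```

The result is a mel spectrogram only, so a separate vocoder is still needed to turn it into audio.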