How much data
Hi, I see you replaced the old vocab with a new Russian IPA vocab. How much data did you use to train this model? Thank you.
Hello. The library contains one statistical model (for generating IPA transcriptions) and two BERT models for accentuation. To train the statistical model, I mainly used words from Wiktionary and Wikipedia (the Russian versions of which contain IPA transcriptions). To train the BERT models, I used ~3 GB of text data in which the correct accents had been placed for ambiguous words. I am currently working on increasing the amount of training data in order to resolve ambiguities in accentuation more accurately.
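(For reference, a minimal usage sketch of the library. The `Transcriptor` class and its `transcribe`/`accentuate` methods are assumptions based on my reading of the repo's README, so check there for the exact API.)

```python
# Sketch: getting IPA transcription and accent placement for Russian text.
# Assumes the omogre package exposes a Transcriptor class with
# transcribe() and accentuate() methods, as suggested by the repo README.
from omogre import Transcriptor

# Model data is expected to be downloaded into data_path on first use.
transcriptor = Transcriptor(data_path='omogre_data')

sentences = ['стены замка']
print(transcriptor.transcribe(sentences))   # IPA transcription
print(transcriptor.accentuate(sentences))   # text with accent marks placed
```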
Thank you for your reply. What I meant is: how much audio data did you use to train XTTS?
I'm sorry, I should have guessed. It was a small experiment, just to understand to what extent it makes sense to use transcription and accents for speech synthesis. I used ~60 hours of speech for training. The README (https://github.com/omogr/omogre/blob/main/README_eng.md) lists the acoustic data I used: the model was trained on the RUSLAN and Common Voice datasets.
https://ruslan-corpus.github.io/
https://commonvoice.mozilla.org/ru
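(To make the pipeline concrete, here is a hedged sketch of the preprocessing step such an experiment implies: transcribing the dataset's text to IPA before TTS training. The LJSpeech-style `metadata.csv` layout and the `Transcriptor` API are illustrative assumptions, not the author's actual training code.)

```python
# Hypothetical preprocessing step: convert the text column of a
# TTS metadata file ("wav_id|text" lines) to IPA so the synthesizer
# is trained on transcriptions instead of raw orthographic text.
# The omogre Transcriptor API is an assumption based on the repo README.
from omogre import Transcriptor

transcriptor = Transcriptor(data_path='omogre_data')

with open('metadata.csv', encoding='utf-8') as src, \
        open('metadata_ipa.csv', 'w', encoding='utf-8') as dst:
    for line in src:
        wav_id, text = line.rstrip('\n').split('|', 1)
        ipa = transcriptor.transcribe([text])[0]
        dst.write(f'{wav_id}|{ipa}\n')
```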
Thank you