Should this be used for LJSpeech or Style conversion?

by niranjanakella - opened Jul 4, 2024

Jul 4, 2024

@Bluebomber182 Hello, I am currently testing this model and am not sure how to use it. Should I load it as an LJSpeech model, where I insert some noise to generate audio, or as a Librispeech model, where I give a voice style sample for style mimicking? Please confirm, and please provide the config.yml for this.

Bluebomber182

Owner Jul 4, 2024

edited Jul 4, 2024

@niranjanakella
Are you using the Inference_LibriTTS.ipynb file via Jupyter Notebook? If so, use the StyleTTS2-LibriTTS config.yml from this link.
https://huggingface.co/yl4579/StyleTTS2-LibriTTS/tree/main/Models/LibriTTS
Then open the Inference_LibriTTS.ipynb file:

jupyter notebook Inference_LibriTTS.ipynb

In the notebook, add the location of the StyleTTS2-LibriTTS config.yml file.

Add the location of the pth file.

Add the location of the reference audio.
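Before editing the notebook, it can help to confirm that the three files exist. The following is only a minimal sketch: the three paths are hypothetical placeholders (not the notebook's actual variable names or this repository's file names), and `find_missing_files` is a helper made up for illustration.

```python
from pathlib import Path

# Hypothetical placeholder locations; replace them with your own paths.
CONFIG_PATH = Path("Models/LibriTTS/config.yml")
CHECKPOINT_PATH = Path("Models/LibriTTS/model.pth")
REFERENCE_AUDIO_PATH = Path("reference_audio/sample.wav")


def find_missing_files(*paths):
    """Return the given paths that do not point to an existing file."""
    return [Path(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    missing = find_missing_files(CONFIG_PATH, CHECKPOINT_PATH, REFERENCE_AUDIO_PATH)
    for path in missing:
        print(f"Missing file: {path}")
    if not missing:
        print("All three files found; these paths can be used in the notebook.")
```

If nothing is reported as missing, the same three paths can be pasted into the corresponding notebook cells.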

niranjanakella

Jul 5, 2024

@Bluebomber182 Given the current model's bottleneck of 512 tokens, is there any implementation to handle long-form sentences?

Bluebomber182

Owner Jul 5, 2024

@niranjanakella
No.
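
For readers who hit the same limit: a common generic workaround, not something provided by this model or confirmed in this thread, is to split long text at sentence boundaries and synthesize each chunk separately. The sketch below is only an illustration under stated assumptions: `count_tokens` is a hypothetical stand-in that counts characters, not the model's real tokens, and `split_into_chunks` is an illustrative helper, not part of StyleTTS2.

```python
import re

MAX_TOKENS = 512  # the limit mentioned in this thread


def count_tokens(text):
    """Hypothetical stand-in: counts characters, not the model's real tokens."""
    return len(text)


def split_into_chunks(text, max_tokens=MAX_TOKENS):
    """Greedily pack whole sentences into chunks of at most max_tokens."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if count_tokens(candidate) <= max_tokens:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_tokens is kept whole here.
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two. Three.", max_tokens=9):
        print(chunk)
```

Each chunk would then be passed to the inference function one at a time and the resulting audio concatenated. A sentence that exceeds the limit on its own still needs separate handling.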
