Guan-Ting/StyleSpeech-MelGAN-vocoder-16kHz

Guan-Ting committed on Dec 17, 2021

Commit 05a0076 (1 parent: 2b9c352)

Update README.md

Files changed (1)

README.md +3 -0

@@ -7,6 +7,9 @@
 * StyleSpeech operates at a 16 kHz sampling rate, and no 16 kHz multi-speaker vocoder was available.
 * I therefore trained this vocoder from scratch on the 100-hour LibriTTS training set. The training pipeline is the same as the official MelGAN (https://github.com/descriptinc/melgan-neurips).
 * The synthesized audio is of good quality and close to the official demo.
+#### Usage
+* Follow the official MelGAN repository (https://github.com/descriptinc/melgan-neurips) to load the pre-trained checkpoint and convert your mel-spectrogram back to a waveform.
 #### Training Details
 * GPU: RTX 2080Ti
 * Training epochs: 3000
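
The Usage note added in this commit can be sketched roughly as follows. This is a minimal sketch, not the author's script: it assumes the official melgan-neurips repository is on `PYTHONPATH` and exposes `mel2wav.MelVocoder` with an `inverse` method, that the mel-spectrogram has 80 channels, and that the hop length is the official default of 256. `checkpoint_dir` is a hypothetical local folder holding the files downloaded from this model repo.

```python
SAMPLE_RATE = 16_000  # stated in this model card
HOP_LENGTH = 256      # assumption: official MelGAN default hop size


def expected_num_samples(num_frames: int, hop_length: int = HOP_LENGTH) -> int:
    """Number of audio samples expected from `num_frames` mel frames."""
    return num_frames * hop_length


def mel_to_waveform(mel, checkpoint_dir: str):
    """Convert a mel-spectrogram tensor to audio with the pre-trained vocoder.

    `mel` is assumed to be a torch tensor of shape (batch, 80, frames).
    `checkpoint_dir` is a hypothetical path to the downloaded checkpoint files.
    """
    # Imported lazily so this file loads even without the MelGAN repo installed.
    from mel2wav import MelVocoder  # from descriptinc/melgan-neurips

    vocoder = MelVocoder(checkpoint_dir)  # loads the generator weights
    return vocoder.inverse(mel)           # tensor of shape (batch, samples)
```

Under these assumptions, 100 mel frames would correspond to 25,600 samples, or 1.6 s of audio at 16 kHz. If the checkpoint was trained with a different hop length, pass it to `expected_num_samples` explicitly.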