---
license: mit
datasets:
- ryota-komatsu/libritts-r-mhubert-2000units
language:
- en
base_model:
- ryota-komatsu/fastspeech2_conformer_hifigan
---

A [conditional flow matching-based acoustic model](https://arxiv.org/abs/2306.15687) with a [HiFi-GAN](https://arxiv.org/abs/2010.05646) vocoder.

This is the model repository for [a GitHub project](https://github.com/ryota-komatsu/speech_resynth).

The model was trained on HuBERT units from [LibriTTS-R](https://arxiv.org/abs/2305.18802) and [EXPRESSO](https://arxiv.org/abs/2308.05725), with the audio downsampled to 16 kHz.