Text-to-Speech
NeMo
Thai

This is a Thai TTS model that uses a voice from the Common Voice dataset, modified so that it does not sound like the original speaker.

pip install "nemo_toolkit[tts]" soundfile

import soundfile as sf
import torch
from nemo.collections.tts.models import Tacotron2Model, UnivNetModel

# Load the Thai Tacotron2 acoustic model and an English UnivNet vocoder, both on CPU
model = Tacotron2Model.from_pretrained("lunarlist/tts-thai").to("cpu")
vocoder = UnivNetModel.from_pretrained(model_name="tts_en_libritts_univnet").to("cpu")

text = "ภาษาไทย ง่าย นิด เดียว"  # roughly: "Thai is very easy"

# Map each character to its index in the model's label list
char_to_idx = {ch: i for i, ch in enumerate(model.hparams["cfg"]["labels"])}
# 66 and 67 are the hard-coded ids that wrap the sequence in the original example
# (presumably the start and end markers)
tokens = torch.tensor([[66] + [char_to_idx[ch] for ch in text] + [67]], dtype=torch.int32)

# Text tokens -> mel spectrogram -> waveform
spectrogram = model.generate_spectrogram(tokens=tokens)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)

# Save the audio to disk in a file called speech.wav (22050 Hz sample rate)
sf.write("speech.wav", audio.to("cpu").detach().numpy()[0], 22050)
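The token-building step in the example looks up every character of the input in the model's label list, so a character that is not in that list raises a KeyError. Below is a minimal sketch of a more forgiving version. The helper name `encode_text` and the toy label list are hypothetical and are not part of NeMo; with the real model, the labels would come from model.hparams["cfg"]["labels"] and the wrapping ids from the example above.

```python
from typing import Dict, List, Sequence, Tuple


def encode_text(
    text: str,
    labels: Sequence[str],
    bos_id: int,
    eos_id: int,
) -> Tuple[List[int], List[str]]:
    """Map each character to its label index, wrapped in start/end ids.

    Hypothetical helper, not part of NeMo. Characters absent from `labels`
    are skipped and returned separately so the caller can inspect them.
    """
    index: Dict[str, int] = {ch: i for i, ch in enumerate(labels)}
    ids: List[int] = [bos_id]
    skipped: List[str] = []
    for ch in text:
        if ch in index:
            ids.append(index[ch])
        else:
            skipped.append(ch)
    ids.append(eos_id)
    return ids, skipped


# Toy label list for illustration only; the real list is model-specific.
toy_labels = ["ก", "ข", "ค", " "]
ids, skipped = encode_text("กข xค", toy_labels, bos_id=4, eos_id=5)
print(ids)      # [4, 0, 1, 3, 2, 5]
print(skipped)  # ['x']
```

The returned list can then be wrapped as torch.tensor([ids]) and passed to generate_spectrogram. Silently dropping characters changes what is spoken, so it is worth checking `skipped` (for example, for Latin letters or digits) before synthesizing.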

Medium: Text-To-Speech ภาษาไทยด้วย Tacotron2 (Thai Text-to-Speech with Tacotron2)

