Text-to-Speech
NeMo
Thai
Commit d14156f, committed by lunarlist and wannaphong
1 Parent(s): 8db4519

Update README.md (#1)


- Update README.md (7801708cf9d9519c6504aa4c3845df366c6a081b)


Co-authored-by: Wannaphong Phatthiyaphaibun <wannaphong@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +30 -0
README.md CHANGED
@@ -1,3 +1,33 @@
  ---
  license: mit
+ datasets:
+ - lunarlist/edited_common_voice
+ language:
+ - th
+ library_name: nemo
+ pipeline_tag: text-to-speech
  ---
+
+ This is a Thai TTS model that uses a voice from the [Common Voice dataset](https://commonvoice.mozilla.org/), modified so that it does not sound like the original speaker.
+
+ > pip install "nemo_toolkit[tts]" soundfile
+
+ ```python
+ import soundfile as sf
+ import torch
+ from nemo.collections.tts.models import Tacotron2Model, UnivNetModel
+
+ # Load the Thai Tacotron2 model and an English UnivNet vocoder, both on CPU
+ model = Tacotron2Model.from_pretrained("lunarlist/tts-thai").to("cpu")
+ vocoder_model = UnivNetModel.from_pretrained(model_name="tts_en_libritts_univnet").to("cpu")
+
+ # Thai input text (roughly: "Thai is really easy")
+ text = "ภาษาไทย ง่าย นิด เดียว"
+ # Map each character to its index in the model's label list
+ dict_idx = {k: i for i, k in enumerate(model.hparams["cfg"]["labels"])}
+ # 66 and 67 wrap the sequence (presumably the start and end token IDs)
+ parsed2 = torch.tensor([[66] + [dict_idx[c] for c in text] + [67]], dtype=torch.int).to("cpu")
+ spectrogram2 = model.generate_spectrogram(tokens=parsed2)
+ audio2 = vocoder_model.convert_spectrogram_to_audio(spec=spectrogram2)
+
+ # Save the audio to disk in a file called speech.wav
+ sf.write("speech.wav", audio2.to("cpu").detach().numpy()[0], 22050)
+ ```
+
+ Medium: [Text-To-Speech ภาษาไทยด้วย Tacotron2 (Thai Text-to-Speech with Tacotron2)](https://medium.com/@taetiyateachamatavorn/text-to-speech-%E0%B8%A0%E0%B8%B2%E0%B8%A9%E0%B8%B2%E0%B9%84%E0%B8%97%E0%B8%A2%E0%B8%94%E0%B9%89%E0%B8%A7%E0%B8%A2-tacotron2-986417b44edc)
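
The README snippet in this change turns text into token IDs by looking up each character in the model's label list and wrapping the result in two fixed IDs. A minimal standalone sketch of that step is shown below, so the mapping can be checked without loading NeMo. The helper `encode_text`, the toy label list, and the explicit `bos_id`/`eos_id` parameters are hypothetical and only for illustration. The real label list comes from `model.hparams["cfg"]["labels"]`, and the README hardcodes 66 and 67 as the wrapping IDs.

```python
from typing import Dict, List, Sequence


def build_vocab(labels: Sequence[str]) -> Dict[str, int]:
    """Map each label (a single character) to its position in the list."""
    return {ch: i for i, ch in enumerate(labels)}


def encode_text(text: str, labels: Sequence[str], bos_id: int, eos_id: int) -> List[int]:
    """Convert text to character IDs wrapped in the given start and end IDs.

    Hypothetical helper: raises ValueError for characters missing from the
    label list instead of failing with a bare KeyError.
    """
    vocab = build_vocab(labels)
    unknown = sorted({ch for ch in text if ch not in vocab})
    if unknown:
        raise ValueError(f"characters not in the label list: {unknown}")
    return [bos_id] + [vocab[ch] for ch in text] + [eos_id]


# Toy label list for illustration only, NOT the model's real labels.
toy_labels = [" ", "ก", "ข", "า"]

# The wrapping IDs 4 and 5 are arbitrary values chosen for this toy example.
print(encode_text("กา ขา", toy_labels, bos_id=4, eos_id=5))
# prints [4, 1, 3, 0, 2, 3, 5]
```

Checking for unknown characters up front gives a clearer error when the input contains characters the model was not trained on, such as Latin letters or digits, assuming those are absent from the label list.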