nairaxo committed
Commit
3c35ca7
1 Parent(s): cfb4e67

Update README.md

Files changed (1): README.md +5 -5
README.md CHANGED
@@ -15,7 +15,7 @@ datasets:
 
 # Vocoder with HiFIGAN trained on LJSpeech
 
-This repository provides all the necessary tools for using a [HiFIGAN](https://arxiv.org/abs/2010.05646) vocoder trained with [LJSpeech](https://keithito.com/LJ-Speech-Dataset/).
+This repository provides all the necessary tools for using a [HiFIGAN](https://arxiv.org/abs/2010.05646) vocoder trained with the [ALFFA Public](https://github.com/getalp/ALFFA_PUBLIC/tree/master/ASR/SWAHILI) Swahili dataset.
 
 The pre-trained model takes a spectrogram as input and produces a waveform as output. Typically, a vocoder is used after a TTS model that converts an input text into a spectrogram.
 
@@ -46,17 +46,17 @@ from speechbrain.pretrained import Tacotron2
 from speechbrain.pretrained import HIFIGAN
 
 # Initialize TTS (tacotron2) and Vocoder (HiFIGAN)
-tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir_tts")
-hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
+tacotron2 = Tacotron2.from_hparams(source="aioxlabs/tacotron-swahili", savedir="tmpdir_tts")
+hifi_gan = HIFIGAN.from_hparams(source="aioxlabs/hifigan-swahili", savedir="tmpdir_vocoder")
 
 # Running the TTS (text-to-spectrogram)
-mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")
+mel_output, mel_length, alignment = tacotron2.encode_text("raisi wa jumhuri ya tanzania")
 
 # Running the vocoder (spectrogram-to-waveform)
 waveforms = hifi_gan.decode_batch(mel_output)
 
 # Save the waveform
-torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)
+torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 16000)
 ```
 
 ### Inference on GPU
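
The `waveforms.squeeze(1)` call in the snippet reflects that the vocoder returns a batch of waveforms with a singleton channel axis (roughly `[batch, 1, samples]`), while a WAV writer such as `torchaudio.save` expects a `[channels, samples]` tensor. A small NumPy sketch of that same shape manipulation (the shapes here are illustrative stand-ins, not SpeechBrain output):

```python
import numpy as np

# Stand-in for the vocoder output: a batch of one waveform with a singleton axis.
waveforms = np.zeros((1, 1, 16000))  # [batch, channel, samples] — illustrative

# Dropping axis 1 yields the [channels, samples] layout a WAV writer expects.
mono = waveforms.squeeze(1)
print(mono.shape)  # → (1, 16000)
```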
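
The one numeric change in this diff is the sample rate passed to `torchaudio.save` (22050 → 16000), matching the rate of the Swahili models. The rate given to the writer is stamped into the WAV header, so a mismatched value makes players render the audio at the wrong speed and pitch. A minimal standard-library sketch (a hypothetical sine tone; no SpeechBrain or torchaudio required) shows where that number ends up:

```python
import math
import struct
import wave

SAMPLE_RATE = 16000   # rate the Swahili models use, per the README change
DURATION_S = 0.5
FREQ_HZ = 440.0

# Synthesize a short sine tone as stand-in audio (hypothetical example data).
n = int(SAMPLE_RATE * DURATION_S)
samples = [
    int(32767 * 0.3 * math.sin(2 * math.pi * FREQ_HZ * i / SAMPLE_RATE))
    for i in range(n)
]

# The rate passed to setframerate() is recorded in the WAV header; writing
# 16 kHz audio with a 22050 header would make it play back too fast.
with wave.open("example_tone.wav", "wb") as w:
    w.setnchannels(1)             # mono
    w.setsampwidth(2)             # 16-bit PCM
    w.setframerate(SAMPLE_RATE)
    w.writeframes(struct.pack("<%dh" % n, *samples))

with wave.open("example_tone.wav", "rb") as r:
    print(r.getframerate(), r.getnframes())  # → 16000 8000
```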