LydiaSanyu committed
Commit 10564d2
1 parent: c7614e5

Update README.md

Files changed (1)
  1. README.md +74 -1
README.md CHANGED
@@ -13,4 +13,77 @@ metrics:
  - mos
  ---
 
- Sunbird Luganda Text-To-Speech translation model trained data crowdsourced on CommonVoice
+ # Sunbird AI Text-to-Speech (TTS) model trained on Luganda text
+
+ ## Text-to-Speech (TTS) with Tacotron2 trained on LJSpeech
+
+ This repository provides all the necessary tools for Text-to-Speech (TTS) with SpeechBrain using a [Tacotron2](https://arxiv.org/abs/1712.05884) model pretrained on [LJSpeech](https://keithito.com/LJ-Speech-Dataset/).
+
+ The pretrained model takes a short text as input and produces a spectrogram as output. The final waveform is obtained by applying a vocoder (e.g., HiFi-GAN) to the generated spectrogram.
+
+
+ ## Install SpeechBrain
+
+ ```
+ pip install speechbrain
+ ```
+
+ To learn more, we encourage you to read the tutorials at
+ [SpeechBrain](https://speechbrain.github.io).
+
+ ### Perform Text-to-Speech (TTS)
+
+ ```
+ import torchaudio
+ from speechbrain.pretrained import Tacotron2
+ from speechbrain.pretrained import HIFIGAN
+
+ # Initialize the TTS model (Tacotron2) and the vocoder (HiFi-GAN)
+ tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir_tts")
+ hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
+
+ # Run the TTS model (text-to-spectrogram)
+ mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")
+
+ # Run the vocoder (spectrogram-to-waveform)
+ waveforms = hifi_gan.decode_batch(mel_output)
+
+ # Save the waveform at a 22050 Hz sample rate
+ torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)
+ ```
+
+ If you want to generate multiple sentences in one shot, you can do so as follows:
+
+ ```
+ from speechbrain.pretrained import Tacotron2
+
+ # Load the same pretrained Tacotron2 model as in the single-sentence example
+ tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir")
+ items = [
+     "A quick brown fox jumped over the lazy dog",
+     "How much wood would a woodchuck chuck?",
+     "Never odd or even"
+ ]
+ # Encode all sentences in a single batch (text-to-spectrogram)
+ mel_outputs, mel_lengths, alignments = tacotron2.encode_batch(items)
+ ```
+
+ ### Inference on GPU
+ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
+
+ ### Training
+ The model was trained with SpeechBrain.
+ To train it from scratch, follow these steps:
+ 1. Clone SpeechBrain:
+ ```bash
+ git clone https://github.com/speechbrain/speechbrain/
+ ```
+ 2. Install it:
+ ```bash
+ cd speechbrain
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+ 3. Run training:
+ ```bash
+ cd recipes/LJSpeech/TTS/tacotron2/
+ python train.py --device=cuda:0 --max_grad_norm=1.0 --data_folder=/your_folder/LJSpeech-1.1 hparams/train.yaml
+ ```
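
The "Inference on GPU" section of the README says to pass `run_opts={"device":"cuda"}` to `from_hparams`. The following is a minimal sketch, not part of this commit, of how that dictionary could be built so the same script still runs on a CPU-only machine. `pick_run_opts` is a hypothetical helper name introduced here for illustration; the only assumption about SpeechBrain is the `run_opts` argument the README already describes.

```python
import importlib.util


def pick_run_opts(prefer_gpu: bool = True) -> dict:
    """Build a run_opts dict for from_hparams, falling back to CPU.

    Hypothetical helper: not part of SpeechBrain or of this commit.
    """
    device = "cpu"
    # Only probe CUDA when a GPU is wanted and PyTorch is importable.
    if prefer_gpu and importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            device = "cuda"
    return {"device": device}


# Usage with the README's loading call (assumes speechbrain is installed):
# tacotron2 = Tacotron2.from_hparams(
#     source="speechbrain/tts-tacotron2-ljspeech",
#     savedir="tmpdir_tts",
#     run_opts=pick_run_opts(),
# )
```

Passing `prefer_gpu=False` forces CPU inference, which can be handy for debugging on a machine that does have a GPU.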