Spaces:
Runtime error
Runtime error
Zhenhong
commited on
Commit
β’
2a68db5
1
Parent(s):
4ae73fd
Modified app.py
Browse files- App demo.jpg +0 -0
- app.py +5 -2
App demo.jpg
ADDED
app.py
CHANGED
@@ -60,11 +60,14 @@ def predict(text, speaker):
|
|
60 |
title = "Text-to-Speech based on SpeechT5"
|
61 |
|
62 |
description = """
|
63 |
-
The <b>SpeechT5</b> model is pre-trained on text as well as speech inputs, with targets that are also a mix of text and speech.
|
|
|
64 |
|
65 |
This space demonstrates the <b>text-to-speech</b> (TTS) checkpoint for the English language.
|
66 |
|
67 |
-
<b>How to use:</b> Enter some English text and choose a speaker. The output is a mel spectrogram, which is converted to a mono 16 kHz waveform by the
|
|
|
|
|
68 |
"""
|
69 |
|
70 |
article = """
|
|
|
60 |
title = "Text-to-Speech based on SpeechT5"
|
61 |
|
62 |
description = """
|
63 |
+
The <b>SpeechT5</b> model is pre-trained on text as well as speech inputs, with targets that are also a mix of text and speech.
|
64 |
+
By pre-training on text and speech at the same time, it learns unified representations for both, resulting in improved modeling capabilities.
|
65 |
|
66 |
This space demonstrates the <b>text-to-speech</b> (TTS) checkpoint for the English language.
|
67 |
|
68 |
+
<b>How to use:</b> Enter some English text and choose a speaker. The output is a mel spectrogram, which is converted to a mono 16 kHz waveform by the
|
69 |
+
HiFi-GAN vocoder. Because the model always applies random dropout, each attempt will give slightly different results.
|
70 |
+
The <em>Surprise Me!</em> option creates a completely randomized speaker.
|
71 |
"""
|
72 |
|
73 |
article = """
|