Text-to-Speech
Transformers
Safetensors
parler_tts
text2text-generation
annotation
ylacombe committed · Commit e890c3d · verified · 1 Parent(s): 5467351

Update README.md

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -184,18 +184,19 @@ import soundfile as sf
 
 device = "cuda:0" if torch.cuda.is_available() else "cpu"
 
-model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-v1").to(device)
-tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")
+model = ParlerTTSForConditionalGeneration.from_pretrained("ai4bharat/indic-parler-tts").to(device)
+tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-parler-tts")
+description_tokenizer = AutoTokenizer.from_pretrained(model.config.text_encoder._name_or_path)
 
 prompt = "अरे, तुम आज कैसे हो?"
 description = "Divya's voice is monotone yet slightly fast in delivery, with a very close recording that almost has no background noise."
 
-input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
+input_ids = description_tokenizer(description, return_tensors="pt").input_ids.to(device)
 prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
 
 generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
 audio_arr = generation.cpu().numpy().squeeze()
-sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
+sf.write("indic_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
 
 The model includes **69 speakers** across 18 officially supported languages, with each language having a set of recommended voices for optimal performance. Below is a table summarizing the available speakers for each language, along with the recommended ones.
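
The key behavioral change in this commit is that the voice description and the spoken prompt no longer share one tokenizer: the description is encoded with a tokenizer loaded from the text encoder's path, while the prompt keeps the model's own tokenizer. A minimal sketch of that split follows. `encode_inputs` is a hypothetical helper that does not appear in the README; it only assumes that each tokenizer is callable with `return_tensors="pt"` and returns an object with an `input_ids` attribute, as in the snippet above.

```python
def encode_inputs(description, prompt, description_tokenizer, prompt_tokenizer, device="cpu"):
    """Tokenize the description and the prompt with their own tokenizers.

    Hypothetical helper for illustration only; it is not part of parler_tts
    or of the README changed in this commit.
    """
    # The description is encoded with the text encoder's tokenizer.
    input_ids = description_tokenizer(description, return_tensors="pt").input_ids.to(device)
    # The prompt (the text to be spoken) is encoded with the model's own tokenizer.
    prompt_input_ids = prompt_tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    return input_ids, prompt_input_ids
```

With the objects from the README snippet, this would be called as `encode_inputs(description, prompt, description_tokenizer, tokenizer, device)`, and the two results passed to `model.generate` as `input_ids` and `prompt_input_ids`. Keeping the two tokenizers as separate arguments makes it harder to swap them by accident, which is the mistake the old snippet made.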