ai4bharat/indic-parler-tts


ylacombe committed on Dec 3, 2024

Commit e890c3d (verified) · 1 Parent(s): 5467351

Update README.md

Files changed (1)

README.md +5 -4


@@ -184,18 +184,19 @@ import soundfile as sf
 device = "cuda:0" if torch.cuda.is_available() else "cpu"
-model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler-tts-mini-v1").to(device)
-tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler-tts-mini-v1")
+model = ParlerTTSForConditionalGeneration.from_pretrained("ai4bharat/indic-parler-tts").to(device)
+tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-parler-tts")
+description_tokenizer = AutoTokenizer.from_pretrained(model.config.text_encoder._name_or_path)
 prompt = "अरे, तुम आज कैसे हो?"
 description = "Divya's voice is monotone yet slightly fast in delivery, with a very close recording that almost has no background noise."
-input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
+input_ids = description_tokenizer(description, return_tensors="pt").input_ids.to(device)
 prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
 generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
 audio_arr = generation.cpu().numpy().squeeze()
-sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
+sf.write("indic_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
 The model includes **69 speakers** across 18 officially supported languages, with each language having a set of recommended voices for optimal performance. Below is a table summarizing the available speakers for each language, along with the recommended ones.
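
The README example names a speaker ("Divya") directly in the description string. As a minimal sketch, assuming a speaker is selected simply by naming them in the description as that example suggests, the snippet below fills a speaker name into the same sentence. `build_description` and its `tone` and `pace` parameters are hypothetical helpers introduced here for illustration; they are not part of the Parler-TTS library or this commit.

```python
def build_description(speaker: str, tone: str = "monotone", pace: str = "slightly fast") -> str:
    """Hypothetical helper: fill a speaker name into the README's example description."""
    return (
        f"{speaker}'s voice is {tone} yet {pace} in delivery, "
        "with a very close recording that almost has no background noise."
    )


# Reproduces the description string used in the README example.
description = build_description("Divya")
print(description)
```

The resulting string would then be passed to `description_tokenizer` in place of the hard-coded `description` in the example above.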