|
--- |
|
library_name: transformers |
|
tags: [] |
|
--- |
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. First, install Parler-TTS from source:
|
|
|
```sh
pip install git+https://github.com/huggingface/parler-tts.git
```
|
|
|
### Quick Start
|
```py
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

model = ParlerTTSForConditionalGeneration.from_pretrained("Cintin/parler-tts-mini-Jenny-colab").to(device)
tokenizer = AutoTokenizer.from_pretrained("Cintin/parler-tts-mini-Jenny-colab")

# The prompt is the text to be spoken; the description controls the voice
prompt = "Hey, how are you doing today?"
description = "Jenny delivers her words quite expressively, in a very confined sounding environment with clear audio quality. She speaks fast."

# Tokenize the description and the prompt separately
input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio_arr = generation.cpu().numpy().squeeze()
```
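If you want to keep the output instead of (or in addition to) playing it, the float array can be written to a WAV file. This is a minimal sketch using only the standard-library `wave` module; the sine wave and `sampling_rate = 44100` below are hypothetical stand-ins for the `audio_arr` and `model.config.sampling_rate` produced by the snippet above.

```python
import wave
import numpy as np

# Hypothetical stand-in for the generated audio: one second of a 440 Hz tone.
# In practice, use audio_arr and model.config.sampling_rate from the snippet above.
sampling_rate = 44100
t = np.linspace(0, 1.0, sampling_rate, endpoint=False)
audio_arr = 0.5 * np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

# Convert float samples in [-1, 1] to 16-bit PCM for the WAV container
pcm = (np.clip(audio_arr, -1.0, 1.0) * 32767).astype(np.int16)

with wave.open("parler_tts_out.wav", "wb") as f:
    f.setnchannels(1)              # mono
    f.setsampwidth(2)              # 16-bit samples
    f.setframerate(sampling_rate)
    f.writeframes(pcm.tobytes())
```

Alternatively, `soundfile.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)` does the same in one call if the `soundfile` package is installed.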
|
|
|
To play the audio in a notebook:
|
```py
from IPython.display import Audio

Audio(audio_arr, rate=model.config.sampling_rate)
```