
Shona Text-to-Speech

This repository contains the Shona (sna) language text-to-speech (TTS) model checkpoint.

Model Details

Model Description

  • Developed by: Fastino Mateteva
  • Model type: Text to Speech
  • Language(s) (NLP): Shona
  • Finetuned from model: SpeechT5

Usage

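First, install the required libraries (the snippets below also import torch and scipy):

pip install --upgrade transformers accelerate torch scipy

Then, run inference with the following code snippet: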


# Load the tokenizer and model directly from the Hub
import torch
from transformers import AutoTokenizer, AutoModelForTextToWaveform

tokenizer = AutoTokenizer.from_pretrained("Fastino06/ff")
model = AutoModelForTextToWaveform.from_pretrained("Fastino06/ff")

# Tokenize the input text into PyTorch tensors
text = "some example text in the Shona language"
inputs = tokenizer(text, return_tensors="pt")

# Disable gradient tracking, since this is inference only
with torch.no_grad():
    output = model(**inputs).waveform
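Before saving or playing the audio, it can help to check its shape and length. The following is a minimal sketch, assuming the waveform comes back with a leading batch dimension, i.e. shape (1, num_samples), as waveform models in transformers typically return. The helper names `to_mono_1d` and `duration_seconds` are hypothetical, and the 16000 Hz rate and zero-filled array are stand-ins for `model.config.sampling_rate` and the real `output`.

```python
import numpy as np


def to_mono_1d(waveform):
    """Drop singleton dimensions so a (1, num_samples) array becomes (num_samples,)."""
    audio = np.squeeze(np.asarray(waveform, dtype=np.float32))
    if audio.ndim != 1:
        raise ValueError(f"expected a single mono waveform, got shape {audio.shape}")
    return audio


def duration_seconds(waveform, sampling_rate):
    """Length of the audio in seconds: number of samples divided by the sampling rate."""
    return to_mono_1d(waveform).shape[0] / float(sampling_rate)


# Stand-in for the model output: 32000 samples with a batch dimension of 1.
# With a real model, pass output.cpu().numpy() and model.config.sampling_rate instead.
dummy_output = np.zeros((1, 32000), dtype=np.float32)
print(duration_seconds(dummy_output, 16000))  # prints 2.0
```

An unexpectedly short duration is often a quick sign that the input text was not tokenized as intended.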

The resulting waveform can be saved as a .wav file:

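import scipy.io.wavfile

# Drop the batch dimension and convert the tensor to a NumPy array before writing
scipy.io.wavfile.write("fassy.wav", rate=model.config.sampling_rate, data=output.squeeze().cpu().numpy())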

Or displayed in a Jupyter Notebook / Google Colab:

from IPython.display import Audio

# Audio expects a 1-D NumPy array rather than a batched tensor
Audio(output.squeeze().cpu().numpy(), rate=model.config.sampling_rate)
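When written from a float32 array, scipy produces a 32-bit floating-point WAV file, which some audio players do not open. As an alternative, the sketch below writes 16-bit PCM using only NumPy and Python's standard `wave` module. It assumes a single mono utterance with samples in the range [-1, 1]; the function name `save_pcm16_wav` is hypothetical and not part of this repository.

```python
import wave

import numpy as np


def save_pcm16_wav(path, waveform, sampling_rate):
    """Write a single mono float waveform to a 16-bit PCM WAV file.

    Returns the number of samples written.
    """
    # Flatten (1, num_samples) to (num_samples,); assumes one utterance only.
    audio = np.asarray(waveform, dtype=np.float32).reshape(-1)
    # Clip to the valid range so scaling cannot overflow int16.
    audio = np.clip(audio, -1.0, 1.0)
    # Scale to little-endian signed 16-bit integers.
    pcm = (audio * 32767.0).astype("<i2")
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16 bit
        wav_file.setframerate(int(sampling_rate))
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```

With the model loaded as above, a call might look like `save_pcm16_wav("fassy_pcm16.wav", output.cpu().numpy(), model.config.sampling_rate)`.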

BibTeX citation

This model was developed by Fastino Mateteva.

Model size: 36.3M params (Safetensors, tensor type F32)