---
language:
- lg
- sw
tags:
- text-to-speech
- TTS
- speech-synthesis
- VITS
license: cc-by-4.0
datasets:
- mozilla-foundation/common_voice_13_0
pipeline_tag: text-to-speech
---
# Text-to-Speech (TTS) with VITS trained on Kiswahili and Luganda Common Voice
This repository provides the tools needed for Text-to-Speech (TTS) with Coqui TTS, using a [VITS](https://arxiv.org/abs/2106.06103) model fine-tuned on Kiswahili and Luganda Common Voice v13 recordings from six speakers with similar intonation.
The pre-trained model takes text as input and produces a waveform (audio) as output.
# How to Synthesize Speech using our models
First, install the Coqui TTS package:
```
pip install TTS
```
### Perform Text-to-Speech (TTS)
```python
from TTS.utils.synthesizer import Synthesizer

# The two empty strings are placeholders: fill in the path to the model
# checkpoint and the path to its config file, in that order.
synthesizer = Synthesizer(
    "",  # path to the model checkpoint
    "",  # path to the model config
    None,
    None,
    None,
    None,
    None,
    None,
    None,
)

sentence_to_synthesize = "Your Kiswahili or Luganda sentence here"
if sentence_to_synthesize:
    print(sentence_to_synthesize)
    wav = synthesizer.tts(sentence_to_synthesize, None, None, None)
    location = "output.wav"  # Choose a desired name for the output file
    synthesizer.save_wav(wav, location)
```
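To synthesize several sentences in one run, the same two calls (`tts` and `save_wav`) can be wrapped in a small loop. The sketch below is only an illustration: `synthesize_batch`, the `outputs` directory, and the `utterance_NNN.wav` naming scheme are hypothetical names chosen here and are not part of Coqui TTS or of this repository. It assumes a `synthesizer` object built as in the example above.

```python
from pathlib import Path


def synthesize_batch(synthesizer, sentences, out_dir="outputs"):
    """Write one WAV file per non-empty sentence and return the file paths.

    `synthesizer` is assumed to expose `tts(...)` and `save_wav(...)` as used
    in the single-sentence example; everything else here is illustrative.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    written = []
    for index, sentence in enumerate(sentences):
        text = sentence.strip()
        if not text:
            continue  # skip blank entries instead of synthesizing silence
        wav = synthesizer.tts(text, None, None, None)
        # The file name keeps the sentence's position in the input list.
        target = out_path / f"utterance_{index:03d}.wav"
        synthesizer.save_wav(wav, str(target))
        written.append(target)
    return written
```

For example, `synthesize_batch(synthesizer, ["Habari", "Oli otya"])` would write two files under `outputs/`, one per sentence.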
### Limitations
We do not provide any warranty on the performance achieved by this model when used on other datasets.
# **Citing**
Please cite our work if you use our models in your research or business.
```bibtex
@inproceedings{buildingTTS,
title={Building a Luganda Text-to-Speech Model from Crowdsourced Data},
author={Kagumire, Sulaiman and Katumba, Andrew and Nakatumba-Nabende, Joyce and Quinn, John},
booktitle={5th Workshop on African Natural Language Processing},
year={2024}
}
```