"In the name of God, the Most Gracious, the Most Merciful" is the key to the door of the treasure of the Wise.

Model Card for Khadijah(SA)

This is the first Persian/English text-to-speech model built on the new Matcha-TTS model.

It is much faster than VITS and gives better results.

It works best with the UNIVERSAL_V1_22050Hz HiFi-GAN vocoder.

You can test this model here, under the Persian+English section.

Enjoy!

Usage with the Sherpa-onnx repo

Remember to add metadata to the ONNX file, as in: https://github.com/k2-fsa/icefall/blob/master/egs/ljspeech/TTS/matcha/export_onnx.py#L174
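As a rough illustration of what adding metadata involves, here is a minimal sketch that uses the `onnx` Python package to write key/value pairs into a model file. The helper names `stringify_metadata` and `add_meta_data`, and the keys and values in `example_meta`, are placeholders chosen for illustration. The linked export script remains the authority on which keys sherpa-onnx actually expects.

```python
from typing import Any, Dict


def stringify_metadata(meta: Dict[str, Any]) -> Dict[str, str]:
    """ONNX metadata entries are string pairs, so convert keys and values to str."""
    return {str(key): str(value) for key, value in meta.items()}


def add_meta_data(filename: str, meta: Dict[str, Any]) -> None:
    """Replace the custom metadata of the ONNX file at `filename`, in place."""
    import onnx  # imported lazily so stringify_metadata works without onnx installed

    model = onnx.load(filename)
    del model.metadata_props[:]  # drop any existing entries
    for key, value in stringify_metadata(meta).items():
        entry = model.metadata_props.add()
        entry.key = key
        entry.value = value
    onnx.save(model, filename)


# Placeholder values for illustration only (assumption, not the author's settings).
example_meta = {"model_type": "matcha-tts", "sample_rate": 22050, "n_speakers": 1}
```

With the `onnx` package installed, a call such as `add_meta_data("model.onnx", example_meta)` would rewrite the file; the file name here is hypothetical.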

Usage with the Matcha-TTS repo

  1. In matcha/text/cleaners.py, in the phonemizer.backend.EspeakBackend part, set:
    language="fa",
  2. Install piper-phonemize: pip install piper-phonemize

  3. In cleaners.py, add the persian_cleaners_piper function below:

import piper_phonemize

def persian_cleaners_piper(text):
    """Pipeline for Persian text: abbreviation expansion, then phonemization with punctuation and stress."""
    # text = convert_to_ascii(text)  # kept disabled for Persian input
    text = lowercase(text)
    text = expand_abbreviations(text)
    # phonemize_espeak returns one list of phonemes per sentence; [0] keeps the first sentence
    phonemes = "".join(piper_phonemize.phonemize_espeak(text=text, voice="fa")[0])
    phonemes = collapse_whitespace(phonemes)

    # Remove unwanted symbols (e.g. '1')
    unwanted_symbols = {'1', '-'}  # Add any other unwanted symbols here
    filtered_phonemes = "".join(char for char in phonemes if char not in unwanted_symbols)

    return filtered_phonemes
  4. In matcha/text/cleaners.py, change the line that calls text_to_sequence to:
    intersperse(text_to_sequence(text, ["persian_cleaners_piper"])[0], 0),
  5. Also set the cleaner in configs/data/custom.yaml: cleaners: [persian_cleaners_piper]

  6. Replace the contents of symbols.py with:

def read_tokens():
    """Read one token per line from the tokens file; any number after the token is ignored."""
    tokens = []
    # Path on the author's machine; point it at your own copy of the tokens file
    with open("/home/oem/Basir/TTS/Matcha/Matcha-TTS/configs/tokens/tokens_sherpa_with_fa.txt", "r", encoding="utf-8") as f:
        for line in f:
            # Remove the newline character at the end
            line = line.rstrip("\n")
            # Split into token and number, preserving whitespace
            if " " in line:
                token = line[:line.index(" ")]  # Everything before the first space
                if len(token) == 0:  # The line starts with a space, so the token is the space itself
                    token = ' '
            else:
                token = line  # If there's no space, the entire line is the token
            tokens.append(token)
    return tokens

symbols = read_tokens()
  7. If save_figure_to_numpy raises errors, replace it with:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import io

def save_figure_to_numpy(fig):
    # Render the figure to an in-memory PNG instead of reading the canvas buffer directly
    buf = io.BytesIO()
    fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)
    buf.seek(0)
    # Decode the PNG and convert it to a NumPy array
    img = Image.open(buf)
    data = np.array(img)
    buf.close()

    return data
  8. After exporting to ONNX, add the sherpa-onnx metadata if you want to use the model with sherpa-onnx:
python3 ./add_sherpa_metadata_to_matcha.py
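To show how the token file, the symbol filter, and the intersperse call fit together, here is a small self-contained sketch. It assumes that token ids follow the line order of the tokens file and that id 0 is the blank/pad token. The miniature token file `SAMPLE_TOKENS` and the helper `phonemes_to_ids` are hypothetical and exist only for this illustration; the real pipeline goes through text_to_sequence and piper_phonemize.

```python
import io


def read_tokens(lines):
    """Parse 'token id' lines; a line that starts with a space encodes the space token."""
    tokens = []
    for line in lines:
        line = line.rstrip("\n")
        if " " in line:
            # Everything before the first space; an empty result means the token is ' '
            token = line[: line.index(" ")] or " "
        else:
            token = line
        tokens.append(token)
    return tokens


def intersperse(seq, item):
    """Place `item` before, between, and after the elements of `seq`."""
    result = [item] * (len(seq) * 2 + 1)
    result[1::2] = seq
    return result


def phonemes_to_ids(phonemes, symbols, unwanted=frozenset({"1", "-"})):
    """Drop unwanted symbols, map each phoneme to its index, then intersperse with 0."""
    symbol_to_id = {symbol: index for index, symbol in enumerate(symbols)}
    kept = [p for p in phonemes if p not in unwanted]
    return intersperse([symbol_to_id[p] for p in kept], 0)


# Hypothetical four-line token file: pad, space, and two phonemes.
SAMPLE_TOKENS = "_ 0\n  1\na 2\nb 3\n"
symbols = read_tokens(io.StringIO(SAMPLE_TOKENS))
ids = phonemes_to_ids("a-b 1a", symbols)
print(ids)  # [0, 2, 0, 3, 0, 1, 0, 2, 0]
```

The interspersed zeros are why the pad token needs to sit at index 0 of the symbol list in this sketch.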

Training results

[Figure: training results]

Credits

Trained by Ali Mahmoudi (@mah92)

Special thanks to Masoud Azizi (@Mablue), Amirreza Ramezani (@brightening-eyes), and Dr. Hamid Jafari (Khaneh Noor Iranian Basir).

Special thanks to the people of the @ttsfarsi channel.

I would also like to thank @csukuangfj from Xiaomi Corporation for the help and care given in the icefall and sherpa-onnx repos.

And we are nothing except through the mercy of our Lord.
