Tunisian Arabic ASR Model with wav2vec2

This repository provides all the tools needed to perform automatic speech recognition with an end-to-end system pretrained on the Tunisian Arabic dialect.

Performance

The performance of the model is:

| Release Version | WER (%) | CER (%) |
|-----------------|---------|---------|
| v1.0 Without LM | 11.82   | 6.33    |
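
For readers unfamiliar with the two metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed from a Levenshtein edit distance. It is not the evaluation script used for this model. The function names are hypothetical, and counting spaces as characters in the CER is an assumption, since conventions differ between toolkits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, relative to the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (assumption: spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(wer("a b c d", "a x c d"))
    # One substituted character out of four reference characters.
    print(cer("abcd", "abxd"))
```

Both rates are normalized by the length of the reference, so they can exceed 100% when the hypothesis contains many insertions.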

Dataset

This ASR model was trained on:

  • TARIC: The Tunisian Arabic Railway Interaction Corpus, a collection of audio recordings and transcriptions of dialogues in the Tunisian Railway Transport Network (Taric Corpus)
  • STAC: A corpus of spoken Tunisian Arabic (STAC Corpus)
  • IWSLT: A Tunisian conversational speech corpus (IWSLT Corpus)
  • Tunspeech: Our custom dataset

Install

pip install speechbrain transformers