SalahZa committed on
Commit
d451434
1 Parent(s): 00a166a

better readme

Files changed (1)
  1. README.md +64 -12
README.md CHANGED
@@ -1,17 +1,48 @@
1
- # Tunisian Arabic ASR Model with wav2vec2
2
 
3
- This repository provides all the necessary tools to perform automatic speech recognition with an end-to-end system pretrained on the Tunisian Arabic dialect.
4
 
5
  ## Performance
6
- The following table summarizes the model's performance on the test sets considered:
7
 
8
- | Dataset | CER | WER |
9
- |-------------- |------- |------- |
10
- | TARIC | 6.22 | 10.55 |
11
- | IWSLT | 21.18 | 39.53 |
12
- | TunSwitch TO | 9.67 | 25.54 |
13
 
14
- More details about the test sets and the conditions leading to this performance are available in the paper.
15
 
16
  ## Datasets
17
  This ASR model was trained on :
@@ -19,11 +50,32 @@ This ASR model was trained on :
19
  * IWSLT : A Tunisian conversational speech corpus - [IWSLT Corpus](https://iwslt.org/2022/dialect)
20
  * TunSwitch : Our crowd-collected dataset described in the paper presented below.
21
 
 
 
 
 
22
 
23
 
24
  ## Inference
25
- ## Install
26
- ```python
27
- pip install speechbrain transformers
28
  ```
29
 
 
 
 
 
1
+ # Overview
2
 
3
+ This project aims to create an Automatic Speech Recognition (ASR) model dedicated to the Tunisian Arabic dialect. The goal is to improve speech recognition technology for underrepresented linguistic communities by transcribing Tunisian dialect speech into written text.
4
+
5
+ ## Dataset
6
+ All the audio and text data collected to train the model have been provided for free to encourage and support research within the community. Please find the paper [here](https://zenodo.org/record/8342762).
7
 
8
  ## Performance
 
9
 
10
+ The following table summarizes the model's performance on the test sets considered:
11
+
12
+ | Dataset | CER | WER |
13
+ | :-------- | :------- | :------------------------- |
14
+ | `TARIC` | `6.22%` | `10.55%` |
15
+ | `IWSLT` | `21.18%` | `39.53%` |
16
+ | `TunSwitch TO` | `9.67%` | `25.54%` |
17
+
18
+ More details about the test sets and the conditions leading to this performance are available in the paper.
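
For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how a word error rate (WER) and character error rate (CER) can be computed for one reference/hypothesis pair. It is not the evaluation code used for the reported numbers: the function names are hypothetical, whitespace tokenization is an assumption, and counting spaces as characters in the CER is also an assumption. Corpus-level scores are usually aggregated over all utterances rather than averaged per pair.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two token sequences (row-by-row dynamic programming)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        current = [i]
        for j, hyp_token in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_token != hyp_token)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def error_rate(reference, hypothesis):
    """Edit distance divided by reference length, as a percentage.

    Assumes a non-empty reference.
    """
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def wer(reference_text, hypothesis_text):
    """Word error rate, assuming whitespace-separated words."""
    return error_rate(reference_text.split(), hypothesis_text.split())


def cer(reference_text, hypothesis_text):
    """Character error rate; spaces count as characters (an assumption)."""
    return error_rate(list(reference_text), list(hypothesis_text))
```

Because the denominator is the reference length, insertions can push either rate above 100%.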
19
+
20
+
21
+
22
+ ## Team
23
+
24
+ The following team members contributed to this project:
25
+
26
+ * [Salah Zaiem](https://fr.linkedin.com/in/salah-zaiem)
27
+ * [Ahmed Amine Ben Abdallah](https://www.linkedin.com/in/aabenz/)
28
+ * [Ata Kabboudi](https://www.linkedin.com/in/ata-kaboudi-63365b1a8)
29
+ * [Amir Kanoun](https://tn.linkedin.com/in/ahmed-amir-kanoun)
30
+
31
+ ## Paper
32
+ More in-depth details and insights are available in a released preprint. Please find the paper [here](https://arxiv.org/abs/2309.11327).
33
+ If you use or refer to this model, please cite:
34
+
35
+ ```
36
+ @misc{abdallah2023leveraging,
37
+ title={Leveraging Data Collection and Unsupervised Learning for Code-switched Tunisian Arabic Automatic Speech Recognition},
38
+ author={Ahmed Amine Ben Abdallah and Ata Kabboudi and Amir Kanoun and Salah Zaiem},
39
+ year={2023},
40
+ eprint={2309.11327},
41
+ archivePrefix={arXiv},
42
+ primaryClass={eess.AS}
43
+ }
44
+ ```
45
 
 
46
 
47
  ## Datasets
48
  This ASR model was trained on :
 
50
  * IWSLT : A Tunisian conversational speech corpus - [IWSLT Corpus](https://iwslt.org/2022/dialect)
51
  * TunSwitch : Our crowd-collected dataset described in the paper presented below.
52
 
53
+ ## Demo
54
+ A live demo is available [here](https://huggingface.co/spaces/SalahZa/Code-Switched-Tunisian-SpeechToText).
55
+
56
+
57
 
58
 
59
  ## Inference
60
+
61
+ ### 1. Create a CSV test file
62
+ First, create a CSV file that follows SpeechBrain's format, which contains 4 columns:
63
+ * ID: a unique identifier for each audio sample in the dataset
64
+ * wav: the path to the audio file
65
+ * wrd: the text transcription of the spoken content in the audio file
66
+ * duration: the duration of the audio in seconds
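
As an illustration of the four columns above, here is a minimal sketch that builds such a CSV with the Python standard library. It is not part of the repository: the function names are hypothetical, the audio is assumed to be uncompressed PCM WAV (so the `wave` module can read it), and using the file stem as the ID assumes file names are unique.

```python
import csv
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def write_manifest(samples, csv_path):
    """Write (wav_path, transcription) pairs to a CSV with ID, wav, wrd, duration columns."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["ID", "wav", "wrd", "duration"])
        writer.writeheader()
        for wav_path, transcription in samples:
            writer.writerow({
                "ID": Path(wav_path).stem,  # assumption: file stems are unique
                "wav": str(wav_path),
                "wrd": transcription,
                "duration": round(wav_duration_seconds(wav_path), 2),
            })
```

A call such as `write_manifest([("audio/sample_001.wav", "the transcription")], "test.csv")` would produce a one-row file; both paths here are placeholders.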
67
+
68
+
69
+ ### 2. Adjust the hyperparams.yaml file
70
+
71
+ Set the **test_csv** parameter to the path of your CSV file.
72
+
73
+
74
+ To run this recipe, do the following:
75
+ ```
76
+ > python train_with_wavlm.py semi_wavlm_large_tunisian_ctc/1234/hyperparams.yaml --test_csv=path_to_csv
77
  ```
78
 
79
+ To run inference on single files, the demo Space provides easy-to-use inference code.
80
+
81
+