Update README.md
README.md
CHANGED
@@ -33,6 +33,9 @@ This model is a fine-tuned version of [openai/whisper-small](https://huggingface
 
 ## Training and evaluation data
 For training,
+- CantoMap: Winterstein, Grégoire, Tang, Carmen and Lai, Regine (2020) "CantoMap: a Hong Kong Cantonese MapTask Corpus", in Proceedings of The 12th Language Resources and Evaluation Conference, Marseille: European Language Resources Association, p. 2899-2906.
+- Cantonse-ASR: Yu, Tiezheng, Frieske, Rita, Xu, Peng, Cahyawijaya, Samuel, Yiu, Cheuk Tung, Lovenia, Holy, Dai, Wenliang, Barezi, Elham, Chen, Qifeng, Ma, Xiaojuan, Shi, Bertram, Fung, Pascale (2022) "Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset", 2022. Link: https://arxiv.org/pdf/2201.02419.pdf
+
 |Name|# of Hours|
 |--|--|
 |Common Voice 16.0 zh-HK Train|138|
@@ -42,8 +45,6 @@ For training,
 |Pseudo-Labelled YouTube Data|438|
 |Total|756|
 
-- CantoMap: Winterstein, Grégoire, Tang, Carmen and Lai, Regine (2020) "CantoMap: a Hong Kong Cantonese MapTask Corpus", in Proceedings of The 12th Language Resources and Evaluation Conference, Marseille: European Language Resources Association, p. 2899-2906.
-- Cantonse-ASR: Yu, Tiezheng, Frieske, Rita, Xu, Peng, Cahyawijaya, Samuel, Yiu, Cheuk Tung, Lovenia, Holy, Dai, Wenliang, Barezi, Elham, Chen, Qifeng, Ma, Xiaojuan, Shi, Bertram, Fung, Pascale (2022) "Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset", 2022. Link: https://arxiv.org/pdf/2201.02419.pdf
 
 For evaluation, Common Voice 16.0 yue Test set is used.
 