PereLluis13 committed on
Commit
b9f77f2
1 Parent(s): fa16ca3

Update README.md

Files changed (1)
  1. README.md +74 -8
README.md CHANGED
@@ -8,28 +8,86 @@ tags:
8
  - collectivat/tv3_parla
9
  - projecte-aina/parlament_parla
10
  - generated_from_trainer
11
  model-index:
12
  - name: wav2vec2-xls-r-300m-ca
13
- results: []
14
  ---
15
 
16
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
17
- should probably proofread and complete it, then remove this comment. -->
18
-
19
  # wav2vec2-xls-r-300m-ca
20
 
21
- This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - CA dataset.
22
- It achieves the following results on the evaluation set:
23
  - Loss: 0.2472
24
  - Wer: 0.1499
25
 
26
  ## Model description
27
 
28
- More information needed
29
 
30
  ## Intended uses & limitations
31
 
32
- More information needed
33
 
34
  ## Training and evaluation data
35
 
@@ -37,6 +95,8 @@ More information needed
37
 
38
  ## Training procedure
39
 
40
  ### Training hyperparameters
41
 
42
  The following hyperparameters were used during training:
@@ -54,6 +114,8 @@ The following hyperparameters were used during training:
54
 
55
  ### Training results
56
 
57
  | Training Loss | Epoch | Step | Validation Loss | Wer |
58
  |:-------------:|:-----:|:-----:|:---------------:|:------:|
59
  | 6.2099 | 0.09 | 500 | 3.4125 | 1.0 |
@@ -126,3 +188,7 @@ The following hyperparameters were used during training:
126
  - Pytorch 1.10.1+cu102
127
  - Datasets 1.18.3
128
  - Tokenizers 0.11.0
8
  - collectivat/tv3_parla
9
  - projecte-aina/parlament_parla
10
  - generated_from_trainer
11
+ - robust-speech-event
12
+ datasets:
13
+ - mozilla-foundation/common_voice_8_0
14
+ - collectivat/tv3_parla
15
+ - projecte-aina/parlament_parla
16
  model-index:
17
  - name: wav2vec2-xls-r-300m-ca
18
+   results:
19
+   - task:
20
+       name: Speech Recognition
21
+       type: automatic-speech-recognition
22
+     dataset:
23
+       name: mozilla-foundation/common_voice_8_0 ca
24
+       type: mozilla-foundation/common_voice_8_0
25
+       args: ca
26
+     metrics:
27
+     - name: Test WER
28
+       type: wer
29
+       value: 0.13170091241317552
30
+     - name: Test CER
31
+       type: cer
32
+       value: 0.03356726205534543
33
+   - task:
34
+       name: Speech Recognition
35
+       type: automatic-speech-recognition
36
+     dataset:
37
+       name: projecte-aina/parlament_parla ca
38
+       type: projecte-aina/parlament_parla
39
+       args: clean
40
+     metrics:
41
+     - name: Test WER
42
+       type: wer
43
+       value: 0.08048005647723261
44
+     - name: Test CER
45
+       type: cer
46
+       value: 0.02240912911020065
47
+   - task:
48
+       name: Speech Recognition
49
+       type: automatic-speech-recognition
50
+     dataset:
51
+       name: collectivat/tv3_parla ca
52
+       type: collectivat/tv3_parla
53
+       args: ca
54
+     metrics:
55
+     - name: Test WER
56
+       type: wer
57
+       value: 0.23320629787889285
58
+     - name: Test CER
59
+       type: cer
60
+       value: 0.10439216202089989
61
+   - task:
62
+       name: Speech Recognition
63
+       type: automatic-speech-recognition
64
+     dataset:
65
+       name: speech-recognition-community-v2/dev_data ca
66
+       type: speech-recognition-community-v2/dev_data
67
+       args: ca
68
+     metrics:
69
+     - name: Test WER
70
+       type: wer
71
+       value: 0.3199671115046487
72
+     - name: Test CER
73
+       type: cer
74
+       value: 0.15820020687277325
75
  ---
76
 
77
  # wav2vec2-xls-r-300m-ca
78
 
79
+ This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - CA, [tv3_parla](https://huggingface.co/datasets/collectivat/tv3_parla), and [parlament_parla](https://huggingface.co/datasets/projecte-aina/parlament_parla) datasets.
80
+ It achieves the following results on the evaluation set, which covers the three datasets:
81
  - Loss: 0.2472
82
  - Wer: 0.1499
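
For reference, the Wer figure above is a word error rate: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The CER values in the metadata are the same ratio computed over characters. A minimal sketch in plain Python follows. The function names are illustrative and not part of this repository, and the reported numbers were presumably computed with standard evaluation tooling over whole test sets rather than with this code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, 1):
        cur = [i]  # distance from ref[:i] to the empty hyp prefix
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of r
                cur[j - 1] + 1,          # insertion of h
                prev[j - 1] + (r != h),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for a single utterance."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for a single utterance (spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

With the reference "el gat dorm al sofà" and the hypothesis "el gat dorm al sofa", one of five words is wrong, so `wer` returns 0.2, while `cer` returns 1/19 because only one of nineteen characters differs.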
83
 
84
  ## Model description
85
 
86
+ Please check the original [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) model card. This is just a fine-tuned version of that model.
87
 
88
  ## Intended uses & limitations
89
 
90
+ Like any model trained on crowdsourced data, this model can reflect the biases and particularities of its training data and of the pretrained model it was fine-tuned from. Moreover, since this is a speech recognition model, it may underperform on some lower-resourced dialects of the Catalan language.
91
 
92
  ## Training and evaluation data
93
 
 
95
 
96
  ## Training procedure
97
 
98
+ The data is preprocessed to remove characters not in the Catalan alphabet. Moreover, numbers are verbalized using code provided by [@ccoreilly](https://github.com/ccoreilly), which can be found in the text/ folder or [here](https://github.com/CollectivaT-dev/catotron-cpu/blob/master/text/numbers_ca.py).
99
+
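
The preprocessing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for training: the allowed character set (lowercase Latin letters, Catalan accented vowels, ç, the middle dot of l·l, apostrophe and hyphen) is a guess at what "the Catalan alphabet" covers here, and `verbalize_numbers` is a hypothetical hook standing in for the number verbalizer from numbers_ca.py, whose actual interface is not shown in this card.

```python
import re

# Assumed character inventory; the set actually used for training may differ.
ALLOWED_CHARS = "a-zàèéíïòóúüç·'\\-"
_DISALLOWED = re.compile(f"[^{ALLOWED_CHARS} ]")
_SPACES = re.compile(r" +")


def normalize_transcript(text, verbalize_numbers=None):
    """Lowercase a transcript and drop characters outside the assumed alphabet.

    verbalize_numbers is an optional, hypothetical callable (str -> str) that
    spells out digits as words before filtering; without it, digits are removed.
    """
    if verbalize_numbers is not None:
        text = verbalize_numbers(text)
    text = text.lower()
    # Replace every disallowed character (punctuation, digits, tabs, ...) with a space.
    text = _DISALLOWED.sub(" ", text)
    # Collapse the runs of spaces left behind by the replacement.
    return _SPACES.sub(" ", text).strip()
```

For example, `normalize_transcript("Bon dia, Col·legi!")` yields `"bon dia col·legi"`. Replacing disallowed characters with a space rather than deleting them keeps words separated when punctuation was the only thing between them.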
100
  ### Training hyperparameters
101
 
102
  The following hyperparameters were used during training:
 
114
 
115
  ### Training results
116
 
117
+ See the TensorBoard tab for the training profile and evaluation results throughout training. The model was evaluated on the test split of each dataset used during training.
118
+
119
  | Training Loss | Epoch | Step | Validation Loss | Wer |
120
  |:-------------:|:-----:|:-----:|:---------------:|:------:|
121
  | 6.2099 | 0.09 | 500 | 3.4125 | 1.0 |
 
188
  - Pytorch 1.10.1+cu102
189
  - Datasets 1.18.3
190
  - Tokenizers 0.11.0
191
+
192
+ # Thanks
193
+
194
+ Thanks to both [@ccoreilly](https://github.com/ccoreilly) and [@gullabi](https://github.com/gullabi), who contributed their own resources and knowledge to make this model possible.