cahya committed on
Commit a50a16d
1 Parent(s): 1c4bdb9

updated readme

  1. README.md +58 -9
README.md CHANGED
@@ -26,21 +26,48 @@ model-index:
26
  - name: Wer
27
  type: wer
28
  value: 3.8273540533062804
29
  ---
30
 
31
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
32
- should probably proofread and complete it, then remove this comment. -->
33
-
34
  # Whisper Medium Indonesian
35
 
36
- This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the mozilla-foundation/common_voice_11_0, magic_data, titml id dataset.
37
- It achieves the following results on the evaluation set:
38
  - Loss: 0.0698
39
  - Wer: 3.8274
40
-
41
- ## Model description
42
-
43
- More information needed
44
 
45
  ## Intended uses & limitations
46
 
@@ -80,7 +107,29 @@ The following hyperparameters were used during training:
80
  | 0.0122 | 2.98 | 9000 | 0.0714 | 3.9795 |
81
  | 0.0049 | 3.31 | 10000 | 0.0720 | 3.9887 |
82
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
83
 
 
 
 
 
 
 
 
84
  ### Framework versions
85
 
86
  - Transformers 4.26.0.dev0
 
26
  - name: Wer
27
  type: wer
28
  value: 3.8273540533062804
29
+ - task:
30
+ name: Automatic Speech Recognition
31
+ type: automatic-speech-recognition
32
+ dataset:
33
+ name: google/fleurs id_id
34
+ type: google/fleurs
35
+ config: id_id
36
+ split: test
37
+ metrics:
38
+ - name: Wer
39
+ type: wer
40
+ value: 9.74
41
+
42
  ---
43
 
 
 
 
44
  # Whisper Medium Indonesian
45
 
46
+ This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the
47
+ Indonesian mozilla-foundation/common_voice_11_0, magic_data, titml, and google/fleurs datasets. It achieves the following
48
+ results:
49
+ ### CV11 test split:
50
  - Loss: 0.0698
51
  - Wer: 3.8274
52
+ ### Google/fleurs test split:
53
+ - Wer: 9.74
54
+
55
+ ## Usage
56
+
57
+ ```python
58
+ from transformers import pipeline
59
+ transcriber = pipeline(
60
+     "automatic-speech-recognition",
61
+     model="cahya/whisper-medium-id"
62
+ )
63
+ transcriber.model.config.forced_decoder_ids = (
64
+     transcriber.tokenizer.get_decoder_prompt_ids(
65
+         language="id",
66
+         task="transcribe"
67
+     )
68
+ )
69
+ transcription = transcriber("my_audio_file.mp3")
70
+ ```
71
 
72
  ## Intended uses & limitations
73
 
 
107
  | 0.0122 | 2.98 | 9000 | 0.0714 | 3.9795 |
108
  | 0.0049 | 3.31 | 10000 | 0.0720 | 3.9887 |
109
 
110
+ ## Evaluation
111
+
112
+ We evaluated the model on the test splits of two datasets: [Common Voice 11](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0)
113
+ and [Google Fleurs](https://huggingface.co/datasets/google/fleurs).
114
+ Because Whisper transcribes casing and punctuation, we evaluate its performance on both raw and normalized text
115
+ (lowercased, with punctuation removed). The results are as follows:
116
+
117
+ ### Common Voice 11
118
+
119
+ | Model | WER |
120
+ |---------------------------------------------------------------------------|------|
121
+ | [cahya/whisper-medium-id](https://huggingface.co/cahya/whisper-medium-id) | 3.83 |
122
+ | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | tbc |
123
+
124
+ ### Google/Fleurs
125
 
126
+ | Model | WER |
127
+ |-------------------------------------------------------------------------------------------------------------|------|
128
+ | [cahya/whisper-medium-id](https://huggingface.co/cahya/whisper-medium-id) | 9.74 |
129
+ | [cahya/whisper-medium-id](https://huggingface.co/cahya/whisper-medium-id) + text normalization | tbc |
130
+ | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | tbc |
131
+ | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) + text normalization | tbc |
132
+
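The raw versus normalized comparison described in the Evaluation section can be illustrated with a small, dependency-free sketch. It is not the evaluation script behind the numbers above, which this commit does not show. The `normalize` and `wer` helpers are hypothetical names, the normalizer handles only ASCII punctuation, and the Indonesian sample sentence is made up for illustration.

```python
import string


def normalize(text: str) -> str:
    """Lowercase and strip ASCII punctuation (a simplified, assumed normalizer)."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # At the start of iteration i, prev[j] is the edit distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical reference transcript and model output.
reference = "Selamat pagi, apa kabar?"
hypothesis = "selamat pagi apa kabar"

raw_wer = wer(reference, hypothesis)
normalized_wer = wer(normalize(reference), normalize(hypothesis))
print(f"raw WER: {raw_wer:.2f}, normalized WER: {normalized_wer:.2f}")
# prints "raw WER: 75.00, normalized WER: 0.00"
```

In this toy case every raw error comes from casing or punctuation, which is why normalization removes them all. On real test sets the gap between the two scores is usually much smaller.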
133
  ### Framework versions
134
 
135
  - Transformers 4.26.0.dev0