marinone94/whisper-medium-nordic

@@ -9,14 +9,10 @@ tags:
 - generated_from_trainer
 datasets:
 - mozilla-foundation/common_voice_11_0
-- mozilla-foundation/common_voice_11_0
-- mozilla-foundation/common_voice_11_0
 - babelbox/babelbox_voice
 - NbAiLab/NST
 - NbAiLab/NPSC
 - google/fleurs
-- google/fleurs
-- google/fleurs
 metrics:
 - wer
 model-index:
@@ -33,47 +29,34 @@ model-index:
     metrics:
     - name: Wer
       type: wer
-      value: 11.307923879152778
-  - task:
-      name: Automatic Speech Recognition
-      type: automatic-speech-recognition
-    dataset:
-      name: babelbox/babelbox_voice
-      type: babelbox/babelbox_voice
-    metrics:
-    - name: Wer
-      type: wer
-      value: 11.307923879152778
   - task:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: NbAiLab/NST
-      type: NbAiLab/NST
-    metrics:
-    - name: Wer
-      type: wer
-      value: 11.307923879152778
-  - task:
-      name: Automatic Speech Recognition
-      type: automatic-speech-recognition
-    dataset:
-      name: NbAiLab/NPSC
-      type: NbAiLab/NPSC
     metrics:
     - name: Wer
       type: wer
-      value: 11.307923879152778
   - task:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: google/fleurs
-      type: google/fleurs
     metrics:
     - name: Wer
       type: wer
-      value: 11.307923879152778
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -81,8 +64,9 @@ should probably proofread and complete it, then remove this comment. -->
 # Whisper Medium Nordic
-This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the mozilla-foundation/common_voice_11_0, the mozilla-foundation/common_voice_11_0, the mozilla-foundation/common_voice_11_0, the babelbox/babelbox_voice, the NbAiLab/NST, the NbAiLab/NPSC, the google/fleurs, the google/fleurs and the google/fleurs datasets.
-It achieves the following results on the evaluation set:
 - Loss: 0.2129
 - Wer: 11.3079
@@ -100,6 +84,9 @@ More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:

 - generated_from_trainer
 datasets:
 - mozilla-foundation/common_voice_11_0
 - babelbox/babelbox_voice
 - NbAiLab/NST
 - NbAiLab/NPSC
 - google/fleurs
 metrics:
 - wer
 model-index:
     metrics:
     - name: Wer
       type: wer
+      value: 11.31
   - task:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
+      name: mozilla-foundation/common_voice_11_0
+      type: mozilla-foundation/common_voice_11_0
+      config: da
+      split: test
     metrics:
     - name: Wer
       type: wer
+      value: 14.86
   - task:
       name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
+      name: mozilla-foundation/common_voice_11_0
+      type: mozilla-foundation/common_voice_11_0
+      config: nn-NO
+      split: test
     metrics:
     - name: Wer
       type: wer
+      value: 37.02
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
 # Whisper Medium Nordic
+This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the [mozilla-foundation/common_voice_11_0](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0) (sv-SE, da, nn-NO), the [babelbox/babelbox_voice](https://huggingface.co/datasets/babelbox/babelbox_voice) (Swedish radio), the [NbAiLab/NST](https://huggingface.co/datasets/NbAiLab/NST) (Norwegian radio), the [NbAiLab/NPSC](https://huggingface.co/datasets/NbAiLab/NPSC) (Norwegian parliament) and the [google/fleurs](https://huggingface.co/datasets/google/fleurs) (sv_se, da_dk, nb_no) datasets. The goal is to leverage transfer learning across Nordic languages, which have strong similarities.
+It achieves the following results on the Common Voice Swedish test set:
 - Loss: 0.2129
 - Wer: 11.3079
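
For readers unfamiliar with the metric, WER values like the ones reported here are usually expressed as a percentage: the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation code used for this model; the function name `word_error_rate` is hypothetical, and text normalization (casing, punctuation), which strongly affects real scores, is left out.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage for a single reference/hypothesis pair.

    Hypothetical helper for illustration only; it applies no text
    normalization and splits words on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

When scoring a whole test set, the edit counts are commonly summed over all utterances and divided by the total number of reference words, rather than averaging per-utterance scores; that aggregation is an assumption about typical practice and is not confirmed by this model card.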
 ## Training procedure
+Please note that a bug during training prevented us from evaluating WER correctly.
+Validation loss suggests we started overfitting after 5000/6000 steps.
 ### Training hyperparameters
 The following hyperparameters were used during training: