metadata

language: mn
license: apache-2.0
tags:
  - whisper-event
  - hf-asr-leaderboard
  - generated_from_multiple_datasets
datasets:
  - mozilla-foundation/common_voice_11_0
  - google/fleurs
  - bayartsogt/ulaanbal-v0
  - bayartsogt/youtube-mongolian-v1
metrics:
  - wer
  - cer
model-index:
  - name: whisper-medium-mn-10
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Common Voice 11.0
          type: mozilla-foundation/common_voice_11_0
          config: mn
          split: test
        metrics:
          - type: wer
            value: 21.258466244264802
            name: Wer
          - type: cer
            value: 6.875610660018193
            name: Cer

whisper-medium-mn-10

This model is a fine-tuned version of openai/whisper-medium, trained on the datasets listed in the metadata above. It achieves the following results on the evaluation set (Common Voice 11.0, mn configuration, test split):

Loss: 0.2103
Wer: 21.2585
Cer: 6.8756
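
For readers unfamiliar with the two metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed: edit distance between reference and hypothesis, divided by the reference length, expressed as a percentage. It is not the evaluation script used for this model; the function names and the example sentence are illustrative, and the card does not say what text normalization (casing, punctuation) was applied before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumes a non-empty reference)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; spaces count as characters here."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative pair: the hypothesis drops the last word of the reference.
reference = "сайн байна уу"
hypothesis = "сайн байна"
example_wer = wer(reference, hypothesis)  # 1 deleted word out of 3
example_cer = cer(reference, hypothesis)  # 3 deleted characters out of 13
```

Under this definition, a WER of 21.2585 means roughly one word-level edit for every five reference words.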

Model description

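Per the card metadata, this is openai/whisper-medium fine-tuned for Mongolian (language code mn) automatic speech recognition.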

Intended uses & limitations

More information needed
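
As a starting point, here is a minimal transcription sketch using the Transformers `pipeline` API. The card does not state which Hub account hosts the model, so `MODEL_ID` is a placeholder with a `<namespace>` that must be filled in; the `transcribe` helper is a hypothetical name, not part of any library.

```python
# Placeholder: replace <namespace> with the Hub account that hosts the model.
MODEL_ID = "<namespace>/whisper-medium-mn-10"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with a Whisper checkpoint from the Hub."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # Whisper works on 30-second windows; longer audio is chunked
    )
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")` with a local recording (the file name is hypothetical) should return the transcript as a string. Decoding audio from a file path typically requires ffmpeg to be installed.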

Training and evaluation data

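The card metadata lists four datasets: mozilla-foundation/common_voice_11_0, google/fleurs, bayartsogt/ulaanbal-v0, and bayartsogt/youtube-mongolian-v1. The reported results are on the test split of Common Voice 11.0 (config mn). How the datasets were combined or split for training is not documented.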

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 40000
mixed_precision_training: Native AMP
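
To make the schedule concrete, the sketch below computes the learning rate implied by the values above: a linear warmup from 0 to 1e-05 over the first 500 steps, then a linear decay to 0 at step 40000. This mirrors the usual behaviour of a linear scheduler with warmup; it is a plain-Python illustration, and `linear_schedule_lr` is a hypothetical helper name rather than a library function.

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=40000):
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few points of the 40000-step run.
schedule_points = {s: linear_schedule_lr(s) for s in (0, 250, 500, 20250, 40000)}
```

With these settings the peak rate is reached at step 500 and has halved again by step 20250.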

Training results

Training Loss	Epoch	Step	Cer	Validation Loss	Wer
0.4197	0.09	1000	19.0947	0.4462	53.9600
0.3288	0.17	2000	14.8016	0.3468	44.2102
0.2737	0.26	3000	12.3471	0.3020	36.1700
0.2558	0.35	4000	11.7171	0.2824	34.1709
0.2406	0.43	5000	10.3551	0.2594	31.1230
0.218	0.52	6000	9.7815	0.2452	29.6865
0.2253	0.61	7000	9.6712	0.2344	29.2932
0.2071	0.69	8000	9.4261	0.2283	28.5067
0.2051	0.78	9000	9.0656	0.2224	27.4033
0.2064	0.87	10000	8.7851	0.2138	26.7206
0.193	0.95	11000	8.5021	0.2089	25.5790
0.1577	1.04	12000	8.2873	0.2072	25.6118
0.1397	1.13	13000	8.2368	0.2046	25.1147
0.1526	1.21	14000	8.7615	0.2065	26.4638
0.1497	1.3	15000	7.9588	0.2004	24.4866
0.1569	1.39	16000	7.9554	0.1990	24.2244
0.1416	1.47	17000	7.8754	0.2001	24.2298
0.1371	1.56	18000	7.8072	0.1932	23.6072
0.1379	1.65	19000	7.5452	0.1916	23.1320
0.1305	1.73	20000	7.4290	0.1880	23.1101
0.1395	1.82	21000	7.4635	0.1877	22.9845
0.1418	1.91	22000	7.5907	0.1862	22.9080
0.1432	1.99	23000	7.4290	0.1847	22.7114
0.0965	2.08	24000	7.0399	0.1931	21.7391
0.0723	2.17	25000	7.2698	0.1961	22.3236
0.0773	2.25	26000	7.0752	0.1977	22.0505
0.0862	2.34	27000	7.0820	0.1959	21.9522
0.0739	2.43	28000	7.1494	0.1982	21.7719
0.0843	2.51	29000	7.1241	0.1963	21.8921
0.0734	2.6	30000	7.1317	0.1980	21.7883
0.0785	2.69	31000	7.1948	0.1955	21.8757
0.0691	2.77	32000	7.0938	0.1978	21.7446
0.0834	2.86	33000	7.0121	0.1953	21.3240
0.0675	2.95	34000	7.0769	0.1958	21.7719
0.042	3.03	35000	6.9624	0.2053	21.3404
0.0474	3.12	36000	7.0306	0.2097	21.5534
0.0428	3.21	37000	6.9809	0.2107	21.3185
0.0343	3.29	38000	6.9514	0.2111	21.3896
0.0378	3.38	39000	6.8756	0.2103	21.2585
0.0361	3.47	40000	6.9009	0.2106	21.3677
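
The headline results at the top of this card (Loss 0.2103, Wer 21.2585, Cer 6.8756) match the step-39000 row rather than the final step. The sketch below shows one way to pick that checkpoint, assuming selection by lowest WER; the card does not state which criterion was actually used. The rows are copied from the last eight evaluations, with each value labeled by field name.

```python
# (step, validation_loss, wer, cer) for the last eight evaluations.
evaluations = [
    (33000, 0.1953, 21.3240, 7.0121),
    (34000, 0.1958, 21.7719, 7.0769),
    (35000, 0.2053, 21.3404, 6.9624),
    (36000, 0.2097, 21.5534, 7.0306),
    (37000, 0.2107, 21.3185, 6.9809),
    (38000, 0.2111, 21.3896, 6.9514),
    (39000, 0.2103, 21.2585, 6.8756),
    (40000, 0.2106, 21.3677, 6.9009),
]

# Assumed criterion: the evaluation with the lowest word error rate wins.
best_step, best_loss, best_wer, best_cer = min(evaluations, key=lambda row: row[2])
```

Note that validation loss bottoms out much earlier in the run (0.1847 at step 23000) while WER and CER keep improving, so selecting by loss would give a different checkpoint.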

Framework versions

Transformers 4.26.0.dev0
Pytorch 1.13.0+cu117
Datasets 2.7.1.dev0
Tokenizers 0.13.2