---
language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_duo_dropout_bpe_dropout
    results: []
---

# scream_duo_dropout_bpe_dropout

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the NbAiLab/NCC_speech_all_v5 dataset.
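
A minimal usage sketch, assuming the standard 🤗 Transformers speech-recognition pipeline; the repository id and audio file name below are placeholders and may need adjusting:

```python
from transformers import pipeline

# Load the fine-tuned Whisper checkpoint via the ASR pipeline.
# "pere/scream_duo_dropout_bpe_dropout" is an assumed hub id; adjust to the actual repository.
asr = pipeline(
    "automatic-speech-recognition",
    model="pere/scream_duo_dropout_bpe_dropout",
)

# Transcribe a local audio file (placeholder path).
print(asr("norwegian_sample.wav")["text"])
```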

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 32
- total_train_batch_size_per_node: 128
- total_train_batch_size: 1024
- total_optimization_steps: 20,000
- starting_optimization_step: None
- finishing_optimization_step: 20,000
- num_train_dataset_workers: 32
- num_hosts: 8
- total_num_training_examples: 20,480,000
- steps_per_epoch: 1314
- num_beams: 5
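
The sketch below maps these values onto 🤗 Transformers `Seq2SeqTrainingArguments` for illustration only; the actual run was distributed across 8 hosts (32 examples per device × 4 devices per host × 8 hosts = 1024 examples per step), and that multi-host launcher setup is not reproduced here. The output directory is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

# Hedged mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="./scream_duo_dropout_bpe_dropout",  # placeholder path
    learning_rate=2e-5,                  # learning_rate
    lr_scheduler_type="linear",          # lr_scheduler_type
    per_device_train_batch_size=32,      # per_device_train_batch_size
    max_steps=20_000,                    # total_optimization_steps
    dataloader_num_workers=32,           # num_train_dataset_workers
    predict_with_generate=True,
    generation_num_beams=5,              # num_beams used during evaluation
)
# Note: the global batch size of 1024 is an effect of the device topology
# (32 per device × 4 devices per host × 8 hosts); it is not set directly here.
```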

### Training results

| step  | eval_loss | train_loss | eval_wer | eval_cer |
|------:|----------:|-----------:|---------:|---------:|
| 0     | 4.2078    | 2.2707     | 169.1230 | 127.5435 |
| 1000  | 3.2725    | 0.8044     | 18.8490  | 7.9113   |
| 2000  | 2.8829    | 0.7114     | 12.7893  | 5.2205   |
| 3000  | 2.5190    | 0.6461     | 11.6931  | 5.1247   |
| 4000  | 2.5935    | 0.5883     | 10.8100  | 4.8325   |
| 5000  | 2.3213    | 0.5753     | 10.7186  | 4.8677   |
| 6000  | 3.2012    | 0.5495     | 10.6273  | 5.0491   |
| 7000  | 2.9775    | 0.5279     | 10.8100  | 5.0441   |
| 8000  | 3.1645    | 0.5412     | 9.9574   | 4.8929   |
| 9000  | 2.9499    | 0.5191     | 11.0840  | 7.1504   |
| 10000 | 3.7657    | 0.5215     | 10.5968  | 5.0189   |
| 11000 | 3.7694    | 0.5086     | 9.9574   | 4.8375   |
| 12000 | 3.9640    | 0.5121     | 10.3228  | 4.9685   |
| 13000 | 4.2364    | 0.4982     | 10.3532  | 5.0088   |
| 14000 | 4.5940    | 0.4908     | 9.9574   | 4.8627   |
| 15000 | 4.7101    | 0.4696     | 10.1401  | 5.0239   |
| 16000 | 4.7501    | 0.4680     | 9.8965   | 4.7317   |
| 17000 | 4.9145    | 0.4751     | 10.0792  | 5.0239   |
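
The eval_wer and eval_cer columns are word and character error rates expressed as percentages. A hedged sketch of how such scores can be computed with the 🤗 Evaluate library (the exact evaluation script is not shown in this card; the strings below are placeholders):

```python
import evaluate

# Load the WER and CER metrics.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder transcriptions and references for illustration.
predictions = ["dette er en test"]
references = ["dette er en liten test"]

print("WER (%):", 100 * wer_metric.compute(predictions=predictions, references=references))
print("CER (%):", 100 * cer_metric.compute(predictions=predictions, references=references))
```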

### Framework versions

- Transformers 4.29.0.dev0
- Datasets 2.12.0
- Tokenizers 0.13.3