Whisper Small Vi - finetune viVoice - 70000

This model (ntxcong/whisper-small-finetuned-vivoice-mp) is a fine-tuned version of openai/whisper-small on the viVoice dataset. It achieves the following results on the evaluation set:

  • Loss: 5.7260
  • Wer: 14.0767

Model description

A Whisper small checkpoint fine-tuned for Vietnamese automatic speech recognition on the viVoice dataset. No further details have been documented.

Intended uses & limitations

More information needed
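Although the card does not document usage, a Whisper fine-tune can typically be loaded through the transformers automatic-speech-recognition pipeline. A minimal sketch, assuming the repo id from this card; `sample.wav` is a placeholder path for any 16 kHz Vietnamese audio file:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
asr = pipeline(
    "automatic-speech-recognition",
    model="ntxcong/whisper-small-finetuned-vivoice-mp",
)

# "sample.wav" is a placeholder; substitute your own audio file.
result = asr("sample.wav")
print(result["text"])
```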

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.25e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 16
  • optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 80000
  • mixed_precision_training: Native AMP
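The linear scheduler with 1000 warmup steps over 80000 training steps ramps the learning rate from 0 up to the peak of 1.25e-05, then decays it linearly back to 0. A pure-Python sketch of that schedule (mirroring the behavior of transformers' linear schedule with warmup; function name is illustrative):

```python
# Constants taken from the hyperparameters above.
PEAK_LR = 1.25e-05
WARMUP_STEPS = 1000
TOTAL_STEPS = 80000

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak down to 0 at the final training step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)
```

Note also that the effective batch size of 64 follows from train_batch_size 32 across 2 devices.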

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.1892        | 0.05   | 4000  | 3.5308          | 18.7775 |
| 0.1551        | 0.1    | 8000  | 4.2465          | 18.1171 |
| 0.1444        | 0.15   | 12000 | 4.4830          | 16.9775 |
| 0.1097        | 1.0266 | 16000 | 4.4955          | 16.1357 |
| 0.0966        | 1.0766 | 20000 | 4.8873          | 15.6825 |
| 0.0915        | 1.1266 | 24000 | 4.8408          | 15.6177 |
| 0.0853        | 2.0032 | 28000 | 5.0293          | 15.1904 |
| 0.065         | 2.0532 | 32000 | 5.0290          | 15.8120 |
| 0.0644        | 2.1032 | 36000 | 5.1940          | 14.5299 |
| 0.0584        | 2.1532 | 40000 | 5.3418          | 15.1515 |
| 0.0466        | 3.0298 | 44000 | 5.2564          | 15.2422 |
| 0.0405        | 3.0798 | 48000 | 5.4065          | 14.7112 |
| 0.0412        | 3.1298 | 52000 | 5.5395          | 14.1414 |
| 0.0344        | 4.0064 | 56000 | 5.6079          | 14.5947 |
| 0.0288        | 4.0564 | 60000 | 5.5141          | 14.4911 |
| 0.0257        | 4.1064 | 64000 | 5.6983          | 14.7242 |
| 0.0249        | 4.1564 | 68000 | 5.7079          | 14.0378 |
| 0.0209        | 5.033  | 72000 | 5.5744          | 13.8177 |
| 0.0192        | 5.083  | 76000 | 5.7272          | 14.1803 |
| 0.0185        | 5.133  | 80000 | 5.7260          | 14.0767 |
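The Wer column above reports word error rate as a percentage: word-level edit distance between reference and hypothesis, divided by the number of reference words. A minimal pure-Python sketch of the metric (the training run likely used a library implementation such as `evaluate`/`jiwer`, which also applies text normalization):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming (Levenshtein) edit distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            # deletion, insertion, substitution/match
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return 100.0 * prev[-1] / len(ref)
```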

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size

  • 242M params (Safetensors, F32 tensors)
