Charcot-Voice_3_0

This model is a fine-tuned version of openai/whisper-large-v3-turbo on an unknown dataset. At the last logged evaluation (step 34000), it reaches a validation loss of 0.7687 and a WER of 41.5261 on the evaluation set.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 9e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
num_devices: 5
gradient_accumulation_steps: 4
total_train_batch_size: 320
total_eval_batch_size: 80
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2400
num_epochs: 8
mixed_precision_training: Native AMP
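
The batch-size totals and the learning-rate schedule implied by the list above can be checked with a short sketch. It assumes the usual meaning of these settings: per-device batch size times device count times accumulation steps for training, and a linear warmup from zero followed by a linear decay to zero. The total step count is not stated in this card; `TOTAL_STEPS` is an estimate derived from the results table, where roughly 4333 optimizer steps correspond to one epoch. The function names are illustrative only.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 9e-05
TRAIN_BATCH_SIZE = 16   # per device
EVAL_BATCH_SIZE = 16    # per device
NUM_DEVICES = 5
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 2400
NUM_EPOCHS = 8

# Assumption: not stated in the card. Estimated from the results table,
# where about 4333 optimizer steps correspond to one epoch.
TOTAL_STEPS = NUM_EPOCHS * 4333


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    total_train = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
    total_eval = EVAL_BATCH_SIZE * NUM_DEVICES  # no accumulation during evaluation
    return total_train, total_eval


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    print(effective_batch_sizes())  # (320, 80), matching the listed totals
    for step in (0, 1200, 2400, 20000, 34000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 9e-05 at step 2400, which falls a little past the middle of the first epoch, and decays for the rest of training.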

Training results

Training Loss	Epoch	Step	Validation Loss	WER
0.6063	0.2308	1000	0.9999	62.7936
0.8076	0.4616	2000	1.4315	92.2763
0.8017	0.6924	3000	1.4195	80.7656
0.7362	0.9231	4000	1.3165	69.8551
0.6636	1.1539	5000	1.2294	75.7631
0.6451	1.3847	6000	1.1654	69.4671
0.6148	1.6155	7000	1.1425	71.3658
0.6066	1.8463	8000	1.1058	67.0564
0.5576	2.0771	9000	1.3624	61.4640
0.5486	2.3079	10000	1.0282	61.1123
0.5282	2.5387	11000	1.2816	57.7962
0.5171	2.7694	12000	0.9579	55.6079
0.4925	3.0002	13000	0.9670	64.6249
0.4548	3.2310	14000	0.9398	55.2250
0.4432	3.4618	15000	0.9145	53.3057
0.4364	3.6926	16000	0.9406	57.8427
0.4236	3.9234	17000	1.1792	52.9177
0.3850	4.1542	18000	0.8318	50.4759
0.3835	4.3850	19000	0.7850	46.2856
0.3633	4.6157	20000	1.0193	49.7569
0.3593	4.8465	21000	0.9096	50.6932
0.3198	5.0773	22000	0.9480	51.0502
0.3157	5.3081	23000	0.8693	46.8236
0.3033	5.5389	24000	0.7335	44.8112
0.3025	5.7697	25000	0.7442	45.8769
0.2873	6.0005	26000	0.9870	50.2069
0.2735	6.2312	27000	0.9005	47.9617
0.2663	6.4620	28000	0.7609	40.1035
0.2598	6.6928	29000	0.7688	42.9902
0.2549	6.9236	30000	0.7648	42.9126
0.2404	7.1544	31000	0.7478	43.0626
0.2437	7.3852	32000	0.7893	40.1604
0.2451	7.6160	33000	0.7712	41.7848
0.2389	7.8468	34000	0.7687	41.5261
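
Neither metric in the table improves monotonically, so the last row is not necessarily the best one. As a minimal sketch, the rows from step 24000 onward can be scanned for the lowest WER and the lowest validation loss; the earlier rows all have higher values on both metrics, so they are left out for brevity. The `Row` type and helper names are hypothetical, and nothing here says which checkpoint was actually published.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss wer")

# Rows copied from the table above (step 24000 onward).
ROWS = [
    Row(0.3033, 5.5389, 24000, 0.7335, 44.8112),
    Row(0.3025, 5.7697, 25000, 0.7442, 45.8769),
    Row(0.2873, 6.0005, 26000, 0.9870, 50.2069),
    Row(0.2735, 6.2312, 27000, 0.9005, 47.9617),
    Row(0.2663, 6.4620, 28000, 0.7609, 40.1035),
    Row(0.2598, 6.6928, 29000, 0.7688, 42.9902),
    Row(0.2549, 6.9236, 30000, 0.7648, 42.9126),
    Row(0.2404, 7.1544, 31000, 0.7478, 43.0626),
    Row(0.2437, 7.3852, 32000, 0.7893, 40.1604),
    Row(0.2451, 7.6160, 33000, 0.7712, 41.7848),
    Row(0.2389, 7.8468, 34000, 0.7687, 41.5261),
]


def best_by(rows, field):
    """Return the row with the smallest value of the given field."""
    return min(rows, key=lambda row: getattr(row, field))


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    last = rows[-1]
    return last.step / last.epoch


if __name__ == "__main__":
    print("lowest WER:", best_by(ROWS, "wer"))
    print("lowest validation loss:", best_by(ROWS, "val_loss"))
    print("steps per epoch (approx.):", round(steps_per_epoch(ROWS)))
```

On these rows the lowest WER (40.1035) is at step 28000 and the lowest validation loss (0.7335) is at step 24000, while the final row at step 34000 has a WER of 41.5261.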