w2v-bert-2.0-lg-CV-Fleurs-filtered-100hrs-v12

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the FLEURS dataset. Its per-epoch results on the evaluation set (validation loss, WER, and CER) are listed in the table at the end of this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 70
mixed_precision_training: Native AMP
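
The relationship between these values can be sketched in a few lines of plain Python. This is an illustration only, not the training script: it assumes a single device (the card does not say how many were used) and takes 7125 optimizer steps per epoch from the results table. `linear_lr` is a hypothetical helper that mirrors the usual shape of a linear schedule with warmup, and rounding the warmup step count is likewise an assumption.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1
NUM_EPOCHS = 70

# Assumptions (not stated in the card): one device, and 7125 optimizer
# steps per epoch as logged in the results table.
NUM_DEVICES = 1
STEPS_PER_EPOCH = 7125

# Samples consumed per optimizer update.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Planned length of the schedule and of its warmup phase.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - warmup_steps)


print(effective_batch_size, total_steps, warmup_steps)
for epoch in (1, 7, 26, 70):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the effective batch size matches the reported total_train_batch_size of 8, and the warmup phase would end at step 49875, which is the end of epoch 7 in the results table.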

Training Loss	Epoch	Step	Validation Loss	WER	CER
0.9834	1.0	7125	0.3827	0.4584	0.0921
0.1914	2.0	14250	0.3460	0.4394	0.0837
0.165	3.0	21375	0.3377	0.4375	0.0827
0.1519	4.0	28500	0.3337	0.4246	0.0805
0.1458	5.0	35625	0.3242	0.4234	0.0789
0.1413	6.0	42750	0.3294	0.4329	0.0816
0.1395	7.0	49875	0.3441	0.4431	0.0866
0.1325	8.0	57000	0.3263	0.4332	0.0867
0.1191	9.0	64125	0.3278	0.4065	0.0788
0.1075	10.0	71250	0.3203	0.4418	0.0808
0.0974	11.0	78375	0.3304	0.4036	0.0771
0.0892	12.0	85500	0.3307	0.4263	0.0819
0.0802	13.0	92625	0.3530	0.4107	0.0785
0.0728	14.0	99750	0.3478	0.4156	0.0795
0.0632	15.0	106875	0.3620	0.4052	0.0787
0.0567	16.0	114000	0.3620	0.4219	0.0796
0.0484	17.0	121125	0.4135	0.4114	0.0787
0.0423	18.0	128250	0.4220	0.4186	0.0814
0.0358	19.0	135375	0.4476	0.4303	0.0825
0.0311	20.0	142500	0.4913	0.4134	0.0806
0.0277	21.0	149625	0.4910	0.4411	0.0850
0.0238	22.0	156750	0.5097	0.4269	0.0821
0.0214	23.0	163875	0.4755	0.4248	0.0837
0.0194	24.0	171000	0.4839	0.4249	0.0826
0.0178	25.0	178125	0.5302	0.4294	0.0828
0.016	26.0	185250	0.4980	0.4385	0.0852
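
To pick a checkpoint from the table above, one option is to look for the epoch with the lowest value of each metric. The sketch below does that in plain Python; the rows are copied from the table, and `best_epoch` is a hypothetical helper rather than part of any library.

```python
# (epoch, validation_loss, wer, cer) copied from the table above.
RESULTS = [
    (1, 0.3827, 0.4584, 0.0921), (2, 0.3460, 0.4394, 0.0837),
    (3, 0.3377, 0.4375, 0.0827), (4, 0.3337, 0.4246, 0.0805),
    (5, 0.3242, 0.4234, 0.0789), (6, 0.3294, 0.4329, 0.0816),
    (7, 0.3441, 0.4431, 0.0866), (8, 0.3263, 0.4332, 0.0867),
    (9, 0.3278, 0.4065, 0.0788), (10, 0.3203, 0.4418, 0.0808),
    (11, 0.3304, 0.4036, 0.0771), (12, 0.3307, 0.4263, 0.0819),
    (13, 0.3530, 0.4107, 0.0785), (14, 0.3478, 0.4156, 0.0795),
    (15, 0.3620, 0.4052, 0.0787), (16, 0.3620, 0.4219, 0.0796),
    (17, 0.4135, 0.4114, 0.0787), (18, 0.4220, 0.4186, 0.0814),
    (19, 0.4476, 0.4303, 0.0825), (20, 0.4913, 0.4134, 0.0806),
    (21, 0.4910, 0.4411, 0.0850), (22, 0.5097, 0.4269, 0.0821),
    (23, 0.4755, 0.4248, 0.0837), (24, 0.4839, 0.4249, 0.0826),
    (25, 0.5302, 0.4294, 0.0828), (26, 0.4980, 0.4385, 0.0852),
]


def best_epoch(rows, column):
    """Return the row with the lowest value at the given column index."""
    return min(rows, key=lambda row: row[column])


best_loss = best_epoch(RESULTS, 1)  # lowest validation loss
best_wer = best_epoch(RESULTS, 2)   # lowest word error rate
best_cer = best_epoch(RESULTS, 3)   # lowest character error rate

print("lowest validation loss: epoch", best_loss[0], best_loss[1])
print("lowest WER: epoch", best_wer[0], best_wer[2])
print("lowest CER: epoch", best_cer[0], best_cer[3])
```

On the logged rows, validation loss is lowest at epoch 10 (0.3203), while WER and CER are both lowest at epoch 11 (0.4036 and 0.0771). Training loss keeps falling after that point while validation loss climbs, which may indicate overfitting in the later epochs.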