
Wav2Vec2-BERT 2.0 - Kiran Pantha

This model is a fine-tuned version of kiranpantha/w2v-bert-2.0-nepali-unlabeled-1 on the OpenSLR54 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5190
  • Wer: 0.4497
  • Cer: 0.1090
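
For quick testing, the checkpoint can be loaded through the standard transformers automatic-speech-recognition pipeline. A minimal sketch ("sample.wav" is a placeholder for your own 16 kHz Nepali audio file):

```python
import torch
from transformers import pipeline

# Load this checkpoint via the generic ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="kiranpantha/w2v-bert-2.0-nepali-unlabeled-2",
    device=0 if torch.cuda.is_available() else -1,
)

# "sample.wav" is a placeholder path; the pipeline decodes and resamples
# the audio before running the model.
print(asr("sample.wav")["text"])
```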

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 2
  • mixed_precision_training: Native AMP
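
For orientation, the settings above map roughly onto transformers TrainingArguments as sketched below. This is an illustration, not the author's training script; output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Sketch mirroring the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-nepali",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=2,
    fp16=True,  # "Native AMP" mixed-precision training
)
```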

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.4494 | 0.0375 | 300 | 0.5118 | 0.4793 | 0.1147 |
| 0.5556 | 0.075 | 600 | 0.6503 | 0.5808 | 0.1448 |
| 0.5684 | 0.1125 | 900 | 0.6258 | 0.5741 | 0.1418 |
| 0.5309 | 0.15 | 1200 | 0.6867 | 0.5391 | 0.1446 |
| 0.615 | 0.1875 | 1500 | 0.6692 | 0.5844 | 0.1566 |
| 0.5627 | 0.225 | 1800 | 0.6586 | 0.5597 | 0.1434 |
| 0.6188 | 0.2625 | 2100 | 0.6250 | 0.5559 | 0.1500 |
| 0.5888 | 0.3 | 2400 | 0.6863 | 0.6162 | 0.1624 |
| 0.5435 | 0.3375 | 2700 | 0.6415 | 0.5736 | 0.1551 |
| 0.5667 | 0.375 | 3000 | 0.6041 | 0.5661 | 0.1478 |
| 0.5323 | 0.4125 | 3300 | 0.5805 | 0.5327 | 0.1392 |
| 0.5471 | 0.45 | 3600 | 0.5699 | 0.5327 | 0.1390 |
| 0.5939 | 0.4875 | 3900 | 0.5739 | 0.5169 | 0.1341 |
| 0.5795 | 0.525 | 4200 | 0.6036 | 0.5278 | 0.1392 |
| 0.4974 | 0.5625 | 4500 | 0.5331 | 0.4997 | 0.1255 |
| 0.5247 | 0.6 | 4800 | 0.5649 | 0.5190 | 0.1300 |
| 0.5035 | 0.6375 | 5100 | 0.5583 | 0.5067 | 0.1292 |
| 0.5354 | 0.675 | 5400 | 0.5472 | 0.5115 | 0.1270 |
| 0.536 | 0.7125 | 5700 | 0.5406 | 0.5012 | 0.1283 |
| 0.498 | 0.75 | 6000 | 0.5747 | 0.5167 | 0.1331 |
| 0.4339 | 0.7875 | 6300 | 0.5224 | 0.4846 | 0.1266 |
| 0.4504 | 0.825 | 6600 | 0.5549 | 0.4982 | 0.1234 |
| 0.4237 | 0.8625 | 6900 | 0.5376 | 0.4759 | 0.1221 |
| 0.4434 | 0.9 | 7200 | 0.5651 | 0.5080 | 0.1303 |
| 0.443 | 0.9375 | 7500 | 0.5222 | 0.4889 | 0.1219 |
| 0.4282 | 0.975 | 7800 | 0.5297 | 0.4936 | 0.1247 |
| 0.4128 | 1.0125 | 8100 | 0.5263 | 0.4804 | 0.1230 |
| 0.4507 | 1.05 | 8400 | 0.5548 | 0.4881 | 0.1254 |
| 0.4008 | 1.0875 | 8700 | 0.5411 | 0.4816 | 0.1232 |
| 0.4834 | 1.125 | 9000 | 0.5264 | 0.4853 | 0.1215 |
| 0.3955 | 1.1625 | 9300 | 0.5288 | 0.4876 | 0.1232 |
| 0.3837 | 1.2 | 9600 | 0.5496 | 0.4853 | 0.1224 |
| 0.3819 | 1.2375 | 9900 | 0.5215 | 0.4739 | 0.1232 |
| 0.3771 | 1.275 | 10200 | 0.5115 | 0.4641 | 0.1188 |
| 0.4067 | 1.3125 | 10500 | 0.5274 | 0.4810 | 0.1236 |
| 0.3561 | 1.35 | 10800 | 0.5366 | 0.4739 | 0.1182 |
| 0.3971 | 1.3875 | 11100 | 0.4951 | 0.4669 | 0.1178 |
| 0.337 | 1.425 | 11400 | 0.5180 | 0.4630 | 0.1156 |
| 0.4031 | 1.4625 | 11700 | 0.4895 | 0.4664 | 0.1156 |
| 0.4278 | 1.5 | 12000 | 0.4858 | 0.4469 | 0.1107 |
| 0.3332 | 1.5375 | 12300 | 0.4986 | 0.4546 | 0.1130 |
| 0.3516 | 1.575 | 12600 | 0.5067 | 0.4677 | 0.1148 |
| 0.4022 | 1.6125 | 12900 | 0.5022 | 0.4638 | 0.1114 |
| 0.3922 | 1.65 | 13200 | 0.4753 | 0.4588 | 0.1130 |
| 0.3483 | 1.6875 | 13500 | 0.4812 | 0.4562 | 0.1135 |
| 0.3572 | 1.725 | 13800 | 0.4940 | 0.4461 | 0.1083 |
| 0.2796 | 1.7625 | 14100 | 0.4854 | 0.4457 | 0.1082 |
| 0.2555 | 1.8 | 14400 | 0.5231 | 0.4482 | 0.1099 |
| 0.2823 | 1.8375 | 14700 | 0.5126 | 0.4475 | 0.1093 |
| 0.2478 | 1.875 | 15000 | 0.5063 | 0.4458 | 0.1087 |
| 0.2435 | 1.9125 | 15300 | 0.5151 | 0.4409 | 0.1077 |
| 0.2478 | 1.95 | 15600 | 0.5185 | 0.4464 | 0.1084 |
| 0.2653 | 1.9875 | 15900 | 0.5190 | 0.4497 | 0.1090 |
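
The Wer and Cer columns can be computed with the evaluate library. A minimal sketch, assuming you already have decoded hypotheses and reference transcripts from the OpenSLR54 evaluation split (the strings below are placeholders):

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholders: in practice, fill these by running the model over the
# evaluation split and pairing each hypothesis with its reference text.
predictions = ["transcribed hypothesis"]
references = ["ground-truth transcript"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```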

Framework versions

  • Transformers 4.45.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
