
Wav2Vec2-BERT 2.0 - Kiran Pantha

This model is a fine-tuned version of kiranpantha/w2v-bert-2.0-nepali-unlabeled-1 on the OpenSLR54 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5190
  • Wer: 0.4497
  • Cer: 0.1090
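
For quick testing, the checkpoint can be loaded through the standard transformers automatic-speech-recognition pipeline. A minimal sketch ("sample.wav" is a placeholder for your own 16 kHz Nepali audio file):

```python
import torch
from transformers import pipeline

# Load this checkpoint via the generic ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="kiranpantha/w2v-bert-2.0-nepali-unlabeled-2",
    device=0 if torch.cuda.is_available() else -1,
)

# "sample.wav" is a placeholder path; the pipeline decodes and resamples
# the audio before running the model.
print(asr("sample.wav")["text"])
```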

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 2
  • mixed_precision_training: Native AMP
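
For orientation, the settings above map roughly onto transformers TrainingArguments as sketched below. This is an illustration, not the author's training script; output_dir is a placeholder:

```python
from transformers import TrainingArguments

# Sketch mirroring the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-nepali",  # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=2,
    fp16=True,  # "Native AMP" mixed-precision training
)
```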

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.4494 | 0.0375 | 300 | 0.5118 | 0.4793 | 0.1147 |
| 0.5556 | 0.075 | 600 | 0.6503 | 0.5808 | 0.1448 |
| 0.5684 | 0.1125 | 900 | 0.6258 | 0.5741 | 0.1418 |
| 0.5309 | 0.15 | 1200 | 0.6867 | 0.5391 | 0.1446 |
| 0.615 | 0.1875 | 1500 | 0.6692 | 0.5844 | 0.1566 |
| 0.5627 | 0.225 | 1800 | 0.6586 | 0.5597 | 0.1434 |
| 0.6188 | 0.2625 | 2100 | 0.6250 | 0.5559 | 0.1500 |
| 0.5888 | 0.3 | 2400 | 0.6863 | 0.6162 | 0.1624 |
| 0.5435 | 0.3375 | 2700 | 0.6415 | 0.5736 | 0.1551 |
| 0.5667 | 0.375 | 3000 | 0.6041 | 0.5661 | 0.1478 |
| 0.5323 | 0.4125 | 3300 | 0.5805 | 0.5327 | 0.1392 |
| 0.5471 | 0.45 | 3600 | 0.5699 | 0.5327 | 0.1390 |
| 0.5939 | 0.4875 | 3900 | 0.5739 | 0.5169 | 0.1341 |
| 0.5795 | 0.525 | 4200 | 0.6036 | 0.5278 | 0.1392 |
| 0.4974 | 0.5625 | 4500 | 0.5331 | 0.4997 | 0.1255 |
| 0.5247 | 0.6 | 4800 | 0.5649 | 0.5190 | 0.1300 |
| 0.5035 | 0.6375 | 5100 | 0.5583 | 0.5067 | 0.1292 |
| 0.5354 | 0.675 | 5400 | 0.5472 | 0.5115 | 0.1270 |
| 0.536 | 0.7125 | 5700 | 0.5406 | 0.5012 | 0.1283 |
| 0.498 | 0.75 | 6000 | 0.5747 | 0.5167 | 0.1331 |
| 0.4339 | 0.7875 | 6300 | 0.5224 | 0.4846 | 0.1266 |
| 0.4504 | 0.825 | 6600 | 0.5549 | 0.4982 | 0.1234 |
| 0.4237 | 0.8625 | 6900 | 0.5376 | 0.4759 | 0.1221 |
| 0.4434 | 0.9 | 7200 | 0.5651 | 0.5080 | 0.1303 |
| 0.443 | 0.9375 | 7500 | 0.5222 | 0.4889 | 0.1219 |
| 0.4282 | 0.975 | 7800 | 0.5297 | 0.4936 | 0.1247 |
| 0.4128 | 1.0125 | 8100 | 0.5263 | 0.4804 | 0.1230 |
| 0.4507 | 1.05 | 8400 | 0.5548 | 0.4881 | 0.1254 |
| 0.4008 | 1.0875 | 8700 | 0.5411 | 0.4816 | 0.1232 |
| 0.4834 | 1.125 | 9000 | 0.5264 | 0.4853 | 0.1215 |
| 0.3955 | 1.1625 | 9300 | 0.5288 | 0.4876 | 0.1232 |
| 0.3837 | 1.2 | 9600 | 0.5496 | 0.4853 | 0.1224 |
| 0.3819 | 1.2375 | 9900 | 0.5215 | 0.4739 | 0.1232 |
| 0.3771 | 1.275 | 10200 | 0.5115 | 0.4641 | 0.1188 |
| 0.4067 | 1.3125 | 10500 | 0.5274 | 0.4810 | 0.1236 |
| 0.3561 | 1.35 | 10800 | 0.5366 | 0.4739 | 0.1182 |
| 0.3971 | 1.3875 | 11100 | 0.4951 | 0.4669 | 0.1178 |
| 0.337 | 1.425 | 11400 | 0.5180 | 0.4630 | 0.1156 |
| 0.4031 | 1.4625 | 11700 | 0.4895 | 0.4664 | 0.1156 |
| 0.4278 | 1.5 | 12000 | 0.4858 | 0.4469 | 0.1107 |
| 0.3332 | 1.5375 | 12300 | 0.4986 | 0.4546 | 0.1130 |
| 0.3516 | 1.575 | 12600 | 0.5067 | 0.4677 | 0.1148 |
| 0.4022 | 1.6125 | 12900 | 0.5022 | 0.4638 | 0.1114 |
| 0.3922 | 1.65 | 13200 | 0.4753 | 0.4588 | 0.1130 |
| 0.3483 | 1.6875 | 13500 | 0.4812 | 0.4562 | 0.1135 |
| 0.3572 | 1.725 | 13800 | 0.4940 | 0.4461 | 0.1083 |
| 0.2796 | 1.7625 | 14100 | 0.4854 | 0.4457 | 0.1082 |
| 0.2555 | 1.8 | 14400 | 0.5231 | 0.4482 | 0.1099 |
| 0.2823 | 1.8375 | 14700 | 0.5126 | 0.4475 | 0.1093 |
| 0.2478 | 1.875 | 15000 | 0.5063 | 0.4458 | 0.1087 |
| 0.2435 | 1.9125 | 15300 | 0.5151 | 0.4409 | 0.1077 |
| 0.2478 | 1.95 | 15600 | 0.5185 | 0.4464 | 0.1084 |
| 0.2653 | 1.9875 | 15900 | 0.5190 | 0.4497 | 0.1090 |
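
The Wer and Cer columns can be computed with the evaluate library. A minimal sketch, assuming you already have decoded hypotheses and reference transcripts from the OpenSLR54 evaluation split (the strings below are placeholders):

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholders: in practice, fill these by running the model over the
# evaluation split and pairing each hypothesis with its reference text.
predictions = ["transcribed hypothesis"]
references = ["ground-truth transcript"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```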

Framework versions

  • Transformers 4.45.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
