metadata
base_model: openai/whisper-small
language:
  - vi
license: apache-2.0
metrics:
  - wer
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
model-index:
  - name: Whisper Small Mnong
    results: []

Whisper Small Mnong

This model is a fine-tuned version of openai/whisper-small on the MnongAudio-v2 dataset. At the final training step (4000), it achieves the following results on the evaluation set:

  • Loss: 0.1380
  • WER: 29.9287
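
The card does not say how WER was computed. As a rough sketch, assuming the usual definition (word-level edit distance divided by the number of reference words, reported in percent), it can be written in plain Python as below. The function name `word_error_rate` is illustrative and is not taken from the training code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one reference/hypothesis pair.

    Assumed definition: (substitutions + deletions + insertions)
    divided by the number of reference words, times 100.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("a b c d", "a x c d"))  # one substitution in four words
    print(word_error_rate("a b", "a b c d e"))    # three insertions against two words
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 100, which is consistent with the early checkpoints in the training results. Evaluation libraries typically aggregate errors over the whole evaluation set rather than averaging per-utterance scores, so this per-pair sketch is for intuition only.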

Model description

More information needed

Intended uses & limitations

More information needed
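
Until the intended uses are documented, the sketch below shows one common way to run a fine-tuned Whisper checkpoint for transcription with the Transformers `pipeline` API. The repository id in `MODEL_ID` is a hypothetical placeholder, since this card does not state the model's repository path, and the helper name `transcribe` is illustrative.

```python
# Hypothetical placeholder: replace with the actual repository id of this model.
MODEL_ID = "your-namespace/whisper-small-mnong"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with an automatic-speech-recognition pipeline."""
    # Imported lazily so that defining this helper does not require
    # Transformers to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # returns a dict with a "text" field
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the checkpoint on first use. Decoding audio from a file path usually requires ffmpeg to be available on the system.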

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
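
To make the schedule concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above (peak learning rate 1e-05, 500 warmup steps, 4000 total steps). It assumes the usual behaviour of a linear scheduler: the rate rises linearly from 0 to the peak during warmup, then decays linearly to 0 at the final step. The function name `linear_lr` is illustrative.

```python
BASE_LR = 1e-5       # learning_rate
WARMUP_STEPS = 500   # lr_scheduler_warmup_steps
TOTAL_STEPS = 4000   # training_steps


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay phase: scale linearly from base_lr down to 0 at the last step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    for step in (0, 250, 500, 2250, 4000):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 500, so the first two evaluation points in the training results (steps 200 and 400) fall inside the warmup phase.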

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER      |
|---------------|--------|------|-----------------|----------|
| 3.2102        | 0.1421 | 200  | 3.0988          | 153.0565 |
| 1.7796        | 0.2843 | 400  | 1.7393          | 146.0774 |
| 1.3216        | 0.4264 | 600  | 1.3372          | 109.1187 |
| 1.0883        | 0.5686 | 800  | 1.0383          | 101.5028 |
| 0.8187        | 0.7107 | 1000 | 0.8161          | 63.4997  |
| 0.652         | 0.8529 | 1200 | 0.6821          | 66.2252  |
| 0.5411        | 0.9950 | 1400 | 0.5551          | 58.2272  |
| 0.4082        | 1.1372 | 1600 | 0.4738          | 58.5074  |
| 0.359         | 1.2793 | 1800 | 0.4075          | 45.1859  |
| 0.2761        | 1.4215 | 2000 | 0.3466          | 43.9379  |
| 0.212         | 1.5636 | 2200 | 0.3002          | 42.0785  |
| 0.2192        | 1.7058 | 2400 | 0.2642          | 36.0927  |
| 0.1932        | 1.8479 | 2600 | 0.2269          | 39.3785  |
| 0.1541        | 1.9900 | 2800 | 0.2013          | 30.5400  |
| 0.0944        | 2.1322 | 3000 | 0.1894          | 36.6021  |
| 0.0848        | 2.2743 | 3200 | 0.1682          | 29.4447  |
| 0.0811        | 2.4165 | 3400 | 0.1565          | 28.0183  |
| 0.0899        | 2.5586 | 3600 | 0.1481          | 31.0749  |
| 0.0749        | 2.7008 | 3800 | 0.1409          | 25.6240  |
| 0.0737        | 2.8429 | 4000 | 0.1380          | 29.9287  |
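
The logged values above also allow a couple of quick sanity checks. The sketch below copies the step, epoch, and WER columns from the training results, finds the checkpoint with the lowest WER, and estimates the number of optimizer steps per epoch from the step/epoch ratio. The dataset-size estimate at the end rests on assumptions the card does not confirm (a single device and no gradient accumulation), so treat it as a rough guess.

```python
# (step, epoch, WER) copied from the training results.
ROWS = [
    (200, 0.1421, 153.0565), (400, 0.2843, 146.0774),
    (600, 0.4264, 109.1187), (800, 0.5686, 101.5028),
    (1000, 0.7107, 63.4997), (1200, 0.8529, 66.2252),
    (1400, 0.9950, 58.2272), (1600, 1.1372, 58.5074),
    (1800, 1.2793, 45.1859), (2000, 1.4215, 43.9379),
    (2200, 1.5636, 42.0785), (2400, 1.7058, 36.0927),
    (2600, 1.8479, 39.3785), (2800, 1.9900, 30.5400),
    (3000, 2.1322, 36.6021), (3200, 2.2743, 29.4447),
    (3400, 2.4165, 28.0183), (3600, 2.5586, 31.0749),
    (3800, 2.7008, 25.6240), (4000, 2.8429, 29.9287),
]

TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list

# Checkpoint with the lowest evaluation WER.
best_step, best_epoch, best_wer = min(ROWS, key=lambda row: row[2])

# Optimizer steps per epoch, estimated from the final logged row.
final_step, final_epoch, _ = ROWS[-1]
steps_per_epoch = final_step / final_epoch

# Rough training-set size. ASSUMPTION: one device, no gradient accumulation.
approx_train_examples = round(steps_per_epoch * TRAIN_BATCH_SIZE)

if __name__ == "__main__":
    print(f"lowest WER {best_wer} at step {best_step}")
    print(f"about {steps_per_epoch:.0f} steps per epoch")
    print(f"roughly {approx_train_examples} training examples (assumption-based)")
```

The lowest WER in the log is at step 3800, not at the final step 4000 whose metrics are reported at the top of this card. WER fluctuates between evaluations even as validation loss falls steadily, so comparing checkpoints on WER can be worthwhile if intermediate checkpoints were saved.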

Framework versions

  • Transformers 4.43.4
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1