---
language:
  - ko
license: apache-2.0
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
base_model: openai/whisper-small
datasets:
  - korean_samll_dataset13
model-index:
  - name: korean-small_t36
    results: []
---

# korean-small_t36

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the korean_samll_dataset13 dataset. It achieves the following results on the evaluation set:

- Loss: 0.2805
- Cer: 10.8675
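
A minimal inference sketch, not part of the original card: the checkpoint id `jsfamily/korean-small_t36` is inferred from this repository, and `sample.wav` is a placeholder audio path.

```python
# Minimal usage sketch (assumptions: the repo id is inferred from this card,
# and "sample.wav" is a placeholder audio file).
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="jsfamily/korean-small_t36",
)

# Whisper expects 16 kHz audio; the pipeline resamples file inputs itself.
print(asr("sample.wav")["text"])
```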

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto the Trainer API):

- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3.0
- mixed_precision_training: Native AMP
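
A hedged sketch of these settings expressed as `Seq2SeqTrainingArguments`, assuming the standard Hugging Face `Seq2SeqTrainer` recipe; `output_dir` is a placeholder, not taken from the card.

```python
# Sketch of the hyperparameters above as Seq2SeqTrainingArguments
# (assumption: the usual HF Seq2SeqTrainer setup; output_dir is a placeholder).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="korean-small_t36",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # epsilon=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3.0,
    fp16=True,                      # mixed_precision_training: Native AMP
)
```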

### Training results

| Training Loss | Epoch | Step | Validation Loss | Cer     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.3938        | 0.11  | 200  | 0.3982          | 14.8401 |
| 0.3479        | 0.21  | 400  | 0.3561          | 12.9310 |
| 0.3206        | 0.32  | 600  | 0.3334          | 12.3517 |
| 0.3094        | 0.42  | 800  | 0.3222          | 12.0216 |
| 0.3088        | 0.53  | 1000 | 0.3120          | 11.6705 |
| 0.2792        | 0.63  | 1200 | 0.3058          | 11.9337 |
| 0.2877        | 0.74  | 1400 | 0.2988          | 11.9042 |
| 0.2722        | 0.84  | 1600 | 0.2913          | 11.6501 |
| 0.285         | 0.95  | 1800 | 0.2881          | 11.5122 |
| 0.1822        | 1.05  | 2000 | 0.2870          | 12.0730 |
| 0.1829        | 1.16  | 2200 | 0.2861          | 11.0178 |
| 0.1843        | 1.26  | 2400 | 0.2850          | 11.4228 |
| 0.1869        | 1.37  | 2600 | 0.2844          | 11.1706 |
| 0.1886        | 1.47  | 2800 | 0.2826          | 11.0313 |
| 0.1816        | 1.58  | 3000 | 0.2805          | 10.8675 |
| 0.1828        | 1.69  | 3200 | 0.2792          | 11.0108 |
| 0.1844        | 1.79  | 3400 | 0.2774          | 10.9839 |
| 0.1847        | 1.9   | 3600 | 0.2747          | 11.2211 |
| 0.1759        | 2.0   | 3800 | 0.2742          | 11.2830 |
| 0.112         | 2.11  | 4000 | 0.2814          | 11.5537 |
| 0.1185        | 2.21  | 4200 | 0.2825          | 10.9629 |
| 0.1142        | 2.32  | 4400 | 0.2812          | 11.4553 |
| 0.1079        | 2.42  | 4600 | 0.2812          | 11.3894 |
| 0.1139        | 2.53  | 4800 | 0.2811          | 11.0738 |
| 0.1085        | 2.63  | 5000 | 0.2811          | 11.3989 |
| 0.1096        | 2.74  | 5200 | 0.2807          | 11.0138 |
| 0.1087        | 2.84  | 5400 | 0.2804          | 11.1387 |
| 0.1103        | 2.95  | 5600 | 0.2801          | 11.1097 |
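
The Cer column is a character error rate reported in percent. A minimal sketch of how such a score is typically computed with the `evaluate` library; the example strings are illustrative, not taken from korean_samll_dataset13.

```python
# Minimal CER sketch using the `evaluate` library (illustrative strings only,
# not drawn from the actual evaluation set).
import evaluate

cer_metric = evaluate.load("cer")

predictions = ["안녕하세요 만나서 반갑습니다"]
references = ["안녕하세요, 만나서 반갑습니다"]

# `compute` returns a fraction; multiply by 100 to match the table's scale.
cer = 100 * cer_metric.compute(predictions=predictions, references=references)
print(f"CER: {cer:.4f}")
```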

### Framework versions

- Transformers 4.39.0.dev0
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2