---
library_name: transformers
language:
  - ko
license: apache-2.0
base_model: openai/whisper-large-v3
tags:
  - generated_from_trainer
datasets:
  - Suchae/whisper-large-v3-ko-middlesenior-dialect-speech-v1.1
model-index:
  - name: Suchae/whisper-large-v3-ko-middlesenior-dialect-speech-v1.1
    results: []
---

# Suchae/whisper-large-v3-ko-middlesenior-dialect-speech-v1.1

This model is a fine-tuned version of openai/whisper-large-v3 on the Suchae/whisper-large-v3-ko-middlesenior-dialect-speech-v1.1 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5249
  • CER: 14.4500 (character error rate, in percent)
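
The model can be used for Korean speech transcription through the standard `transformers` ASR pipeline. The snippet below is a minimal inference sketch, not taken from the source repository: the audio file name is a placeholder, and forcing language/task via `generate_kwargs` follows the usual Whisper convention.

```python
# Minimal inference sketch. Assumptions: standard transformers ASR pipeline;
# "sample.wav" is a placeholder audio file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Suchae/whisper-large-v3-ko-middlesenior-dialect-speech-v1.1",
)

# Whisper operates on 30-second windows; chunking handles longer recordings.
result = asr(
    "sample.wav",
    chunk_length_s=30,
    generate_kwargs={"language": "korean", "task": "transcribe"},
)
print(result["text"])
```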

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-06
  • train_batch_size: 10
  • eval_batch_size: 5
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 80
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 1
  • mixed_precision_training: Native AMP
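
These settings map directly onto Hugging Face `Seq2SeqTrainingArguments`. The sketch below reconstructs them under the assumption of a single GPU (10 per-device examples × 8 accumulation steps = effective batch of 80); the `output_dir` is illustrative, and dataset/collator wiring is omitted.

```python
# Sketch of the training configuration implied by the list above.
# Assumptions: single GPU, Seq2SeqTrainer workflow; output_dir is illustrative.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-ko-middlesenior-dialect-speech-v1.1",
    learning_rate=1e-6,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=5,
    gradient_accumulation_steps=8,   # 10 x 8 = total train batch size of 80
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=1,
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
)
```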

### Training results

| Training Loss | Epoch  | Step | Validation Loss | CER     |
|:-------------:|:------:|:----:|:---------------:|:-------:|
| 1.333         | 0.0548 | 64   | 0.9131          | 19.4136 |
| 1.1099        | 0.1096 | 128  | 0.7550          | 17.2538 |
| 0.9577        | 0.1643 | 192  | 0.6955          | 17.2038 |
| 0.9198        | 0.2191 | 256  | 0.6615          | 15.8003 |
| 0.7995        | 0.2739 | 320  | 0.6357          | 16.5130 |
| 0.7898        | 0.3287 | 384  | 0.6150          | 15.8066 |
| 0.7344        | 0.3835 | 448  | 0.6022          | 14.9533 |
| 0.7035        | 0.4383 | 512  | 0.5846          | 14.3594 |
| 0.6936        | 0.4930 | 576  | 0.5711          | 16.6193 |
| 0.6427        | 0.5478 | 640  | 0.5602          | 14.7063 |
| 0.6365        | 0.6026 | 704  | 0.5530          | 15.0095 |
| 0.6107        | 0.6574 | 768  | 0.5440          | 14.5813 |
| 0.596         | 0.7122 | 832  | 0.5379          | 15.2315 |
| 0.5831        | 0.7670 | 896  | 0.5357          | 15.1377 |
| 0.5542        | 0.8217 | 960  | 0.5308          | 15.0314 |
| 0.5675        | 0.8765 | 1024 | 0.5277          | 15.2252 |
| 0.532         | 0.9313 | 1088 | 0.5252          | 15.6722 |
| 0.5255        | 0.9861 | 1152 | 0.5249          | 14.4500 |
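
The CER column is a character error rate reported in percent, typically computed with the `cer` metric from the Hugging Face `evaluate` library in fine-tunes like this one. A minimal sketch assuming that setup, with placeholder strings:

```python
# Minimal CER computation sketch using the `evaluate` library (requires jiwer).
# The prediction/reference strings are illustrative placeholders.
import evaluate

cer_metric = evaluate.load("cer")
score = cer_metric.compute(
    predictions=["안녕하세요 반갑습니다"],
    references=["안녕하십니까 반갑습니다"],
)
print(f"CER: {100 * score:.4f}")  # scaled to percent, as in the table above
```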

### Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu118
  • Datasets 3.0.0
  • Tokenizers 0.19.1