
Whisper Tiny Hu

This model is a fine-tuned version of openai/whisper-base, trained on 40% of the sarpba/big_audio_data_hun dataset (475,000 lines, roughly 475 hours of audio). It achieves the following results on the evaluation set:

  • Loss: 0.8324
  • Wer Ortho: 40.6270
  • Wer: 36.4265
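
For reference, a minimal transcription sketch using the transformers pipeline; the repo id comes from this card's model tree, and the audio file name is a placeholder:

```python
# Minimal usage sketch; "sample_hu.wav" is a placeholder file name.
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
asr = pipeline(
    "automatic-speech-recognition",
    model="sarpba/whisper-tiny-hungarian_475",
    device=device,
)

# Force Hungarian transcription rather than relying on language auto-detection.
result = asr("sample_hu.wav", generate_kwargs={"language": "hungarian", "task": "transcribe"})
print(result["text"])
```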

Quantized model tests on google/fleurs

| Model         | WER   | CER   | Normalized WER | Normalized CER | Database      | Split | Runtime |
|---------------|-------|-------|----------------|----------------|---------------|-------|---------|
| int8_bfloat16 | 43.86 | 14.33 | 39.40          | 14.33          | google/fleurs | test  | 126.79  |
| bfloat16      | 43.39 | 14.10 | 38.96          | 14.11          | google/fleurs | test  | 119.79  |
| int8_float32  | 41.70 | 12.82 | 36.95          | 12.85          | google/fleurs | test  | 134.13  |
| int8          | 41.69 | 12.84 | 36.96          | 12.86          | google/fleurs | test  | 136.07  |
| int8_float16  | 41.66 | 12.97 | 36.96          | 13.00          | google/fleurs | test  | 126.24  |
| float16       | 41.52 | 12.90 | 36.79          | 12.95          | google/fleurs | test  | 117.99  |
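
The compute types in the table above (int8, int8_float16, int8_bfloat16, ...) match CTranslate2 quantization modes, so these tests were presumably run through faster-whisper. A hedged sketch, assuming the checkpoint has been converted to CTranslate2 format (the output directory name is hypothetical):

```python
# Conversion step (output directory is hypothetical):
#   ct2-transformers-converter --model sarpba/whisper-tiny-hungarian_475 \
#       --output_dir whisper-tiny-hu-ct2 --quantization int8_float16
from faster_whisper import WhisperModel

# Load the converted model with one of the compute types from the table.
model = WhisperModel("whisper-tiny-hu-ct2", device="cuda", compute_type="int8_float16")

segments, info = model.transcribe("sample_hu.wav", language="hu")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```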

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 7e-05
  • train_batch_size: 64
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
  • mixed_precision_training: Native AMP
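
A hedged reconstruction of this configuration with transformers' Seq2SeqTrainingArguments; output_dir is hypothetical, and fp16=True is an assumption based on the "Native AMP" note above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-tiny-hu",   # hypothetical
    learning_rate=7e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,    # effective train batch size: 128
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    fp16=True,                        # "Native AMP" mixed precision (assumed fp16)
    seed=42,
)
```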

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer Ortho | Wer     |
|---------------|--------|-------|-----------------|-----------|---------|
| 0.3295        | 0.2693 | 1000  | 0.9198          | 56.3713   | 52.8136 |
| 0.2606        | 0.5385 | 2000  | 0.8563          | 51.6035   | 47.1599 |
| 0.2154        | 0.8078 | 3000  | 0.8362          | 48.4217   | 45.0160 |
| 0.1679        | 1.0770 | 4000  | 0.8385          | 46.1071   | 42.3727 |
| 0.1612        | 1.3463 | 5000  | 0.8393          | 45.8802   | 41.9313 |
| 0.1527        | 1.6155 | 6000  | 0.8277          | 43.4648   | 39.1404 |
| 0.1491        | 1.8848 | 7000  | 0.8253          | 44.1254   | 39.7092 |
| 0.1146        | 2.1540 | 8000  | 0.8385          | 42.2041   | 37.9032 |
| 0.1095        | 2.4233 | 9000  | 0.8387          | 41.7541   | 37.3534 |
| 0.1090        | 2.6925 | 10000 | 0.8351          | 41.3986   | 37.1251 |
| 0.1039        | 2.9618 | 11000 | 0.8324          | 40.6270   | 36.4265 |
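
The two WER columns follow the usual convention for Whisper fine-tuning runs: "Wer Ortho" is computed on raw (orthographic) text, while "Wer" is computed after basic text normalization. A sketch of that computation; the prediction/reference strings are placeholders:

```python
# "Wer Ortho" on raw text, "Wer" after Whisper's BasicTextNormalizer.
import evaluate
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

wer_metric = evaluate.load("wer")
normalizer = BasicTextNormalizer()

predictions = ["Jó napot kívánok!"]  # hypothetical model output
references = ["jó napot kívánok"]    # hypothetical reference transcript

wer_ortho = 100 * wer_metric.compute(predictions=predictions, references=references)
wer = 100 * wer_metric.compute(
    predictions=[normalizer(p) for p in predictions],
    references=[normalizer(r) for r in references],
)
print(f"Wer Ortho: {wer_ortho:.2f}  Wer: {wer:.2f}")
```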

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.3.0+cu121
  • Datasets 3.0.0
  • Tokenizers 0.19.1