metadata
base_model: openai/whisper-tiny
datasets:
  - fleurs
language:
  - id
library_name: transformers
license: apache-2.0
metrics:
  - wer
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
model-index:
  - name: Whisper Tiny Indonesian - Chee Li
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Google Fleurs
          type: fleurs
          config: id_id
          split: test
          args: 'config: id split: test'
        metrics:
          - type: wer
            value: 61.51004728132388
            name: Wer

Whisper Tiny Indonesian - Chee Li

This model is a fine-tuned version of openai/whisper-tiny on the Indonesian portion of the Google Fleurs dataset. After 5000 training steps, it achieves the following results on the evaluation set:

  • Loss: 1.1541
  • WER: 61.5100 (word error rate, in percent)
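
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate can be computed for a single utterance: the word-level edit distance between reference and hypothesis, divided by the number of reference words and scaled to percent. The function name `word_error_rate` and the example sentences are illustrative assumptions, not part of this model's evaluation code. Corpus-level scores such as the one above are usually obtained by summing errors and reference words over all utterances, and any text normalization applied before scoring is not documented in this card.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent for one reference/hypothesis pair.

    Hypothetical helper for illustration; whitespace tokenization only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One deleted word ("makan") and one substituted word ("goreng" -> "goren")
# against a five-word reference.
print(word_error_rate("saya suka makan nasi goreng", "saya suka nasi goren"))  # 40.0
```

A WER of about 61.5 therefore means that, relative to the reference transcripts, roughly six out of every ten words required a substitution, deletion, or insertion.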

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
  • mixed_precision_training: Native AMP
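
The learning-rate settings above (linear scheduler, 500 warmup steps, 5000 training steps, peak rate 1e-05) can be sketched in plain Python as follows. This is an approximation of the usual linear warmup and linear decay rule, written without any library dependency; the constant and function names are assumptions made for illustration, and the actual training run relied on the scheduler built into the Trainer.

```python
PEAK_LR = 1e-5        # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 5000    # training_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Rises linearly from 0 to PEAK_LR during warmup, then decays
    linearly back to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the rate at the evaluation steps reported in this card.
for step in (0, 500, 1000, 2000, 3000, 4000, 5000):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.2e}")
```

Under this rule the rate reaches its peak at step 500 and is back to half the peak at step 2750, so the later checkpoints are trained with progressively smaller updates.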

Training results

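| Training Loss | Epoch   | Step | Validation Loss | WER     |
|---------------|---------|------|-----------------|---------|
| 0.3796        | 4.4643  | 1000 | 0.7934          | 60.0473 |
| 0.097         | 8.9286  | 2000 | 0.8975          | 61.0446 |
| 0.0167        | 13.3929 | 3000 | 1.0411          | 61.1998 |
| 0.0057        | 17.8571 | 4000 | 1.1252          | 62.8694 |
| 0.0044        | 22.3214 | 5000 | 1.1541          | 61.5100 |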
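
The figures in the table above can be inspected with a short script. The sketch below copies the five logged rows, finds the checkpoint with the lowest WER, checks whether validation loss rises at every evaluation, and derives the number of optimizer steps per epoch from the Step and Epoch columns. The names `EvalRow` and `ROWS` are illustrative. The estimate of the training-set size assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Rows copied from the training results table.
ROWS = [
    EvalRow(0.3796, 4.4643, 1000, 0.7934, 60.0473),
    EvalRow(0.097, 8.9286, 2000, 0.8975, 61.0446),
    EvalRow(0.0167, 13.3929, 3000, 1.0411, 61.1998),
    EvalRow(0.0057, 17.8571, 4000, 1.1252, 62.8694),
    EvalRow(0.0044, 22.3214, 5000, 1.1541, 61.5100),
]

# Checkpoint with the lowest word error rate.
best = min(ROWS, key=lambda row: row.wer)

# True if validation loss increases at every successive evaluation.
val_loss_rising = all(b.val_loss > a.val_loss for a, b in zip(ROWS, ROWS[1:]))

# Optimizer steps per epoch, derived from the last row.
steps_per_epoch = round(ROWS[-1].step / ROWS[-1].epoch)

# Rough training-set size. ASSUMPTION: train_batch_size 16 on one device,
# no gradient accumulation; the final batch of an epoch may be partial.
approx_train_examples = steps_per_epoch * 16

print(f"best WER {best.wer} at step {best.step}")
print(f"validation loss rising throughout: {val_loss_rising}")
print(f"steps per epoch: {steps_per_epoch}, approx. examples: {approx_train_examples}")
```

The lowest WER is logged at step 1000, while training loss keeps falling and validation loss keeps rising afterwards, which is the pattern usually associated with overfitting. An earlier checkpoint may therefore be preferable to the final one.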

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.20.1