whisper-small-da / README.md
metadata
language:
  - da
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_13_0
metrics:
  - wer
model-index:
  - name: Whisper Small Da - WasuratS
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 13
          type: mozilla-foundation/common_voice_13_0
          config: da
          split: test
          args: da
        metrics:
          - name: Wer
            type: wer
            value: 23.39882224190943

Whisper Small Da - WasuratS

This model is a fine-tuned version of openai/whisper-small on the Danish (da) configuration of the Common Voice 13 dataset (mozilla-foundation/common_voice_13_0). It achieves the following results on the evaluation set:

  • Loss: 0.6393
  • Wer Ortho: 29.0926
  • Wer: 23.3988
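
The card does not define the two error rates. A common convention, assumed here, is that "Wer Ortho" is the word error rate on the raw text (casing and punctuation kept) and "Wer" is the word error rate after text normalization. The sketch below illustrates that difference with a plain word-level edit distance. The `simple_normalize` helper and the example sentences are hypothetical and are not the normalizer or data used to produce the numbers above.

```python
import re


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length, in percent."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def simple_normalize(text: str) -> str:
    """Hypothetical normalizer: lowercase and drop punctuation."""
    return re.sub(r"[^\w\s]", "", text.lower()).strip()


# Illustrative sentences only, not taken from Common Voice.
reference = "Hej, hvordan har du det?"
hypothesis = "hej hvordan har du det"

wer_ortho = word_error_rate(reference, hypothesis)  # 40.0: "Hej," and "det?" differ
wer_norm = word_error_rate(simple_normalize(reference), simple_normalize(hypothesis))  # 0.0
print(wer_ortho, wer_norm)
```

Under this assumption, the gap between the two reported values (29.09 vs. 23.40) would reflect errors in casing and punctuation rather than in the recognized words.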

Model description

openai/whisper-small

Training and evaluation data

mozilla-foundation/common_voice_13_0

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • training_steps: 4000
  • mixed_precision_training: Native AMP

The corresponding `Seq2SeqTrainingArguments` configuration:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-da",
    # Optimization
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    warmup_steps=50,
    max_steps=4000,
    # Memory and precision
    gradient_checkpointing=True,
    fp16=True,
    fp16_full_eval=True,
    # Evaluation (generation-based, scored with WER)
    evaluation_strategy="steps",
    per_device_eval_batch_size=16,
    predict_with_generate=True,
    generation_max_length=225,
    # Checkpointing and logging
    save_steps=500,
    eval_steps=500,
    logging_steps=25,
    report_to=["tensorboard"],
    # Keep the checkpoint with the lowest WER
    load_best_model_at_end=True,
    metric_for_best_model="wer",
    greater_is_better=False,
    push_to_hub=True,
)
```
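
To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above (peak 1e-5, 50 warmup steps, 4000 total steps). It mirrors the usual shape of such a schedule (linear ramp up, then linear decay to zero). The function name `linear_schedule_lr` is hypothetical, and the sketch is not the scheduler object used during training.

```python
def linear_schedule_lr(step: int,
                       peak_lr: float = 1e-5,
                       warmup_steps: int = 50,
                       total_steps: int = 4000) -> float:
    """Learning rate at a given optimizer step for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 25, 50, 2025, 4000):
    print(step, linear_schedule_lr(step))
```

With these settings the rate peaks at step 50, is back to half the peak at step 2025, and reaches zero at step 4000.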

Training results

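| Training Loss | Epoch | Step | Validation Loss | Wer Ortho | Wer     |
|---------------|-------|------|-----------------|-----------|---------|
| 0.218         | 1.61  | 500  | 0.4724          | 30.2496   | 24.7069 |
| 0.0628        | 3.22  | 1000 | 0.4825          | 28.8946   | 23.3154 |
| 0.0289        | 4.82  | 1500 | 0.5311          | 29.3376   | 23.4666 |
| 0.0078        | 6.43  | 2000 | 0.5740          | 29.4627   | 23.6542 |
| 0.0032        | 8.04  | 2500 | 0.6070          | 29.0613   | 23.2790 |
| 0.0025        | 9.65  | 3000 | 0.6274          | 29.1187   | 23.4770 |
| 0.0012        | 11.25 | 3500 | 0.6335          | 29.0978   | 23.3623 |
| 0.0011        | 12.86 | 4000 | 0.6393          | 29.0926   | 23.3988 |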
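
The results above can be inspected with a few lines of Python. This sketch copies the logged values into a list and looks up which evaluation step had the lowest WER and the lowest validation loss, and roughly how many optimizer steps make up one epoch. It only reads the reported numbers and makes no claim about which checkpoint was published.

```python
# (step, epoch, training_loss, validation_loss, wer_ortho, wer), copied from the results above
rows = [
    (500, 1.61, 0.218, 0.4724, 30.2496, 24.7069),
    (1000, 3.22, 0.0628, 0.4825, 28.8946, 23.3154),
    (1500, 4.82, 0.0289, 0.5311, 29.3376, 23.4666),
    (2000, 6.43, 0.0078, 0.5740, 29.4627, 23.6542),
    (2500, 8.04, 0.0032, 0.6070, 29.0613, 23.2790),
    (3000, 9.65, 0.0025, 0.6274, 29.1187, 23.4770),
    (3500, 11.25, 0.0012, 0.6335, 29.0978, 23.3623),
    (4000, 12.86, 0.0011, 0.6393, 29.0926, 23.3988),
]

best_wer = min(rows, key=lambda r: r[5])   # row with the lowest WER
best_loss = min(rows, key=lambda r: r[3])  # row with the lowest validation loss
steps_per_epoch = rows[-1][0] / rows[-1][1]  # final step divided by final epoch

print(f"lowest WER: {best_wer[5]} at step {best_wer[0]}")
print(f"lowest validation loss: {best_loss[3]} at step {best_loss[0]}")
print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
```

The training loss falls towards zero while the validation loss rises after step 500, and WER stays within roughly half a point from step 1000 onward, so most of the gain appears to come in the first few epochs.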

Framework versions

  • Transformers 4.29.2
  • PyTorch 1.13.1+cu117
  • Datasets 2.12.0
  • Tokenizers 0.13.3