---
library_name: transformers
language:
  - vi
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - capleaf/viVoice
metrics:
  - wer
model-index:
  - name: Whisper Small Vi - finetune viVoice - 70000
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: viVoice
          type: capleaf/viVoice
          config: default
          split: test
          args: 'split: train'
        metrics:
          - name: Wer
            type: wer
            value: 14.076664076664077
---

Whisper Small Vi - finetune viVoice - 70000

This model is a fine-tuned version of openai/whisper-small on the viVoice dataset. It achieves the following results on the evaluation set:

  • Loss: 5.7260
  • Wer: 14.0767
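
For quick orientation, inference with this checkpoint follows the standard transformers ASR pipeline. A minimal sketch, assuming a hypothetical Hub repo id ntxcong/whisper-small-vi-vivoice (substitute the actual model id) and a local audio file sample.wav:

```python
# Minimal inference sketch. NOTE: "ntxcong/whisper-small-vi-vivoice" is a
# placeholder repo id; replace it with the actual model id on the Hub.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="ntxcong/whisper-small-vi-vivoice",  # hypothetical id
    chunk_length_s=30,  # Whisper decodes audio in 30-second windows
)

# Force Vietnamese transcription; Whisper is multilingual and may otherwise
# misdetect the language or translate instead of transcribing.
result = asr("sample.wav", generate_kwargs={"language": "vi", "task": "transcribe"})
print(result["text"])
```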

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1.25e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 80000
  • mixed_precision_training: Native AMP
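
As a reference, these settings map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a reconstruction from the list above, not the author's published training script; the output directory and launch command are assumptions, and the model, data, and Trainer setup are omitted:

```python
# Reconstruction of the configuration above as Seq2SeqTrainingArguments;
# the original training script is not published, so treat this as a sketch.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-vi",  # hypothetical output path
    learning_rate=1.25e-5,
    per_device_train_batch_size=32,   # x2 GPUs -> total train batch 64
    per_device_eval_batch_size=8,     # x2 GPUs -> total eval batch 16
    seed=42,
    optim="adamw_torch",              # AdamW; betas/epsilon as listed above (torch defaults)
    lr_scheduler_type="linear",
    warmup_steps=1000,
    max_steps=80000,
    fp16=True,                        # Native AMP mixed precision
)
# Multi-GPU training on 2 devices would be launched with e.g.
#   torchrun --nproc_per_node=2 train.py
```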

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.1892        | 0.05   | 4000  | 3.5308          | 18.7775 |
| 0.1551        | 0.1    | 8000  | 4.2465          | 18.1171 |
| 0.1444        | 0.15   | 12000 | 4.4830          | 16.9775 |
| 0.1097        | 1.0266 | 16000 | 4.4955          | 16.1357 |
| 0.0966        | 1.0766 | 20000 | 4.8873          | 15.6825 |
| 0.0915        | 1.1266 | 24000 | 4.8408          | 15.6177 |
| 0.0853        | 2.0032 | 28000 | 5.0293          | 15.1904 |
| 0.065         | 2.0532 | 32000 | 5.0290          | 15.8120 |
| 0.0644        | 2.1032 | 36000 | 5.1940          | 14.5299 |
| 0.0584        | 2.1532 | 40000 | 5.3418          | 15.1515 |
| 0.0466        | 3.0298 | 44000 | 5.2564          | 15.2422 |
| 0.0405        | 3.0798 | 48000 | 5.4065          | 14.7112 |
| 0.0412        | 3.1298 | 52000 | 5.5395          | 14.1414 |
| 0.0344        | 4.0064 | 56000 | 5.6079          | 14.5947 |
| 0.0288        | 4.0564 | 60000 | 5.5141          | 14.4911 |
| 0.0257        | 4.1064 | 64000 | 5.6983          | 14.7242 |
| 0.0249        | 4.1564 | 68000 | 5.7079          | 14.0378 |
| 0.0209        | 5.033  | 72000 | 5.5744          | 13.8177 |
| 0.0192        | 5.083  | 76000 | 5.7272          | 14.1803 |
| 0.0185        | 5.133  | 80000 | 5.7260          | 14.0767 |
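
The Wer column is word error rate in percent. The card does not publish its evaluation code; as a reference for how such values are produced, a minimal sketch using the evaluate library on toy strings:

```python
# Illustration of how WER values like those above are computed; the exact
# evaluation script is not published, so this is only a sketch.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["xin chào việt nam"]  # model transcripts (toy data)
references = ["xin chào Việt Nam"]   # ground-truth transcripts (toy data)

# compute() returns a fraction; the table reports it scaled to percent.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```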

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0