metadata
license: apache-2.0
base_model: facebook/wav2vec2-base
tags:
  - automatic-speech-recognition
  - timit_asr
  - generated_from_trainer
datasets:
  - timit_asr
metrics:
  - wer
model-index:
  - name: wav2vec2-base-timit-fine-tuned
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: TIMIT_ASR - NA
          type: timit_asr
          config: clean
          split: test
          args: 'Config: na, Training split: train, Eval split: test'
        metrics:
          - name: Wer
            type: wer
            value: 0.4090867704634435

wav2vec2-base-timit-fine-tuned

This model is a fine-tuned version of facebook/wav2vec2-base on the TIMIT_ASR - NA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4218
  • Wer: 0.4091
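
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate can be computed for a single utterance: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions. This is not the evaluation code used for this model, which may aggregate errors over the whole test set differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example sentences, not taken from the evaluation set.
one_deletion = word_error_rate("she had your dark suit", "she had dark suit")
many_insertions = word_error_rate("a b", "a x y b z")
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 1.0, as in the second example. This is consistent with a WER above 1 being reported early in training.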

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
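
To make the scheduler settings concrete, here is a rough sketch of a linear schedule with warmup using the values listed above. The total step count is not stated in this card. It is inferred here from the training results table (100 steps correspond to about 0.8621 epochs), so treat `TOTAL_STEPS` as an assumption. The function `linear_lr` is a hypothetical helper, not a library API.

```python
BASE_LR = 1e-4        # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the list above
NUM_EPOCHS = 20       # num_epochs from the list above

# Assumption: inferred from the results table, where step 100 is epoch 0.8621.
STEPS_PER_EPOCH = round(100 / 0.8621)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Sample the schedule at a few steps to see its shape.
schedule = {step: linear_lr(step) for step in (0, 500, 1000, 1660, TOTAL_STEPS)}
```

Under these assumptions the warmup covers a bit over 40% of training, so the peak learning rate is only reached at step 1000 and decays linearly afterwards.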

Training results

Training Loss   Epoch     Step   Validation Loss   Wer
3.1612          0.8621    100    3.1181            1.0
2.978           1.7241    200    2.9722            1.0
2.9185          2.5862    300    2.9098            1.0
2.1282          3.4483    400    2.0066            1.0247
1.1234          4.3103    500    1.0197            0.8393
0.602           5.1724    600    0.6714            0.6600
0.5032          6.0345    700    0.5285            0.5659
0.3101          6.8966    800    0.4819            0.5282
0.3432          7.7586    900    0.4653            0.5272
0.1922          8.6207    1000   0.4672            0.4918
0.2284          9.4828    1100   0.4834            0.4870
0.1372          10.3448   1200   0.4380            0.4727
0.1105          11.2069   1300   0.4509            0.4594
0.0992          12.0690   1400   0.4196            0.4544
0.1226          12.9310   1500   0.4237            0.4321
0.1013          13.7931   1600   0.4113            0.4298
0.0661          14.6552   1700   0.4038            0.4276
0.0901          15.5172   1800   0.4321            0.4225
0.053           16.3793   1900   0.4076            0.4236
0.0805          17.2414   2000   0.4336            0.4156
0.049           18.1034   2100   0.4193            0.4114
0.0717          18.9655   2200   0.4139            0.4091
0.0389          19.8276   2300   0.4216            0.4087
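
As a small illustration of how the table above might be read, the sketch below copies the step, validation loss, and WER columns and looks up the checkpoint that is best under each criterion. The variable names are illustrative. Nothing here implies which checkpoint was actually saved or published.

```python
# (step, validation_loss, wer) copied from the training results table.
rows = [
    (100, 3.1181, 1.0), (200, 2.9722, 1.0), (300, 2.9098, 1.0),
    (400, 2.0066, 1.0247), (500, 1.0197, 0.8393), (600, 0.6714, 0.6600),
    (700, 0.5285, 0.5659), (800, 0.4819, 0.5282), (900, 0.4653, 0.5272),
    (1000, 0.4672, 0.4918), (1100, 0.4834, 0.4870), (1200, 0.4380, 0.4727),
    (1300, 0.4509, 0.4594), (1400, 0.4196, 0.4544), (1500, 0.4237, 0.4321),
    (1600, 0.4113, 0.4298), (1700, 0.4038, 0.4276), (1800, 0.4321, 0.4225),
    (1900, 0.4076, 0.4236), (2000, 0.4336, 0.4156), (2100, 0.4193, 0.4114),
    (2200, 0.4139, 0.4091), (2300, 0.4216, 0.4087),
]

# The two selection criteria do not have to agree on the same checkpoint.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_wer = min(rows, key=lambda row: row[2])

print(f"lowest validation loss: step {best_by_loss[0]} ({best_by_loss[1]})")
print(f"lowest WER:             step {best_by_wer[0]} ({best_by_wer[2]})")
```

In this table the validation loss bottoms out at step 1700 (0.4038) while the WER keeps improving until step 2300 (0.4087), so the choice of selection metric matters when picking a checkpoint.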

Framework versions

  • Transformers 4.42.0.dev0
  • PyTorch 2.3.0a0+git71dd2de
  • Datasets 2.19.1
  • Tokenizers 0.19.1