speech_ocean_wav2vec_mdd

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3663
  • Wer: 0.0863
  • Cer: 0.0692
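For readers unfamiliar with these metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are conventionally defined: the edit distance between reference and prediction, divided by the reference length. It is an illustrative assumption about the metric definition, not the evaluation code used for this model. The card does not say which library computed the scores, nor whether the "words" are orthographic words or some other token unit. The function names `edit_distance`, `wer`, and `cer` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference item missing from hyp)
                curr[j - 1] + 1,     # insertion (extra item in hyp)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits (spaces included) divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    print(cer("abc", "abd"))
    # Many insertions against a short reference push the rate above 1.0.
    print(wer("hello", "hello there world now"))
```

Because insertions count as errors but do not lengthen the reference, either rate can exceed 1.0, which is consistent with the Wer of 1.0258 logged at the first evaluation step.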

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 20
  • mixed_precision_training: Native AMP
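As a rough illustration of how these settings fit together, the sketch below restates them as a plain dictionary, derives the total train batch size, and evaluates a linear schedule with warmup. Several things here are assumptions rather than facts from the card: the dictionary keys follow `transformers.TrainingArguments` field names, but the author's training script is not shown; a single device is assumed, since 16 x 2 = 32 only holds for one device; and the total step count of 780 is taken from the last row of the training results. The helpers `effective_batch_size` and `linear_lr` are hypothetical names.

```python
# Values copied from the hyperparameter list; key names mirror
# transformers.TrainingArguments fields (an assumption about the script).
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 20,
    "fp16": True,  # "Native AMP" mixed precision
}

NUM_DEVICES = 1    # assumption: not stated in the card
TOTAL_STEPS = 780  # assumption: last optimizer step logged in the training results


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("total_train_batch_size:", effective_batch_size(training_kwargs))
    for step in (0, 250, 500, 640, 780):
        lr = linear_lr(step, training_kwargs["learning_rate"],
                       training_kwargs["warmup_steps"], TOTAL_STEPS)
        print(f"step {step:3d}: lr = {lr:.6f}")
```

Under these assumptions the learning rate peaks at step 500 and decays to zero by step 780, so roughly the first two thirds of the run are spent in warmup.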

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    | Cer    |
|---------------|---------|------|-----------------|--------|--------|
| 45.149        | 0.9873  | 39   | 45.0584         | 1.0258 | 0.7932 |
| 40.7325       | 2.0     | 79   | 32.0660         | 1.0    | 1.0    |
| 14.8164       | 2.9873  | 118  | 8.1694          | 1.0    | 1.0    |
| 5.6535        | 4.0     | 158  | 4.5922          | 1.0    | 1.0    |
| 3.9508        | 4.9873  | 197  | 3.8581          | 1.0    | 1.0    |
| 3.8065        | 6.0     | 237  | 3.7907          | 1.0    | 1.0    |
| 3.766         | 6.9873  | 276  | 3.7769          | 1.0    | 1.0    |
| 3.7552        | 8.0     | 316  | 3.7465          | 1.0    | 1.0    |
| 3.7489        | 8.9873  | 355  | 3.7611          | 1.0    | 1.0    |
| 3.7263        | 10.0    | 395  | 3.7234          | 1.0    | 1.0    |
| 3.7343        | 10.9873 | 434  | 3.6934          | 1.0    | 1.0    |
| 3.6327        | 12.0    | 474  | 3.4204          | 1.0    | 1.0    |
| 3.1861        | 12.9873 | 513  | 2.7907          | 0.9710 | 0.9864 |
| 2.2814        | 14.0    | 553  | 1.7142          | 0.5088 | 0.5401 |
| 1.6854        | 14.9873 | 592  | 1.0573          | 0.2488 | 0.1914 |
| 1.2968        | 16.0    | 632  | 0.7282          | 0.1786 | 0.1391 |
| 0.8626        | 16.9873 | 671  | 0.5435          | 0.1305 | 0.0999 |
| 0.7852        | 18.0    | 711  | 0.4440          | 0.1046 | 0.0831 |
| 0.6332        | 18.9873 | 750  | 0.3847          | 0.0936 | 0.0748 |
| 0.6518        | 19.7468 | 780  | 0.3663          | 0.0863 | 0.0692 |

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1

Model size: 316M params (Safetensors, tensor type F32)