Charcot-Voice_3_0

This model is a fine-tuned version of openai/whisper-large-v3-turbo on an unknown dataset. It achieves the following results on the evaluation set at the final logged checkpoint (step 34000):

  • Loss: 0.7687
  • WER: 41.5261
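
For readers unfamiliar with the metric, here is a minimal sketch of word error rate as a word-level edit distance divided by the reference length. Values such as 92.2763 in the results table suggest the card reports WER as a percentage, so the sketch scales by 100; that scaling, the whitespace tokenization, and the function name `word_error_rate` are assumptions, since the card does not say how text was normalized or which metric implementation was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the evaluation set.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.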

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 9e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 5
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 320
  • total_eval_batch_size: 80
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2400
  • num_epochs: 8
  • mixed_precision_training: Native AMP
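
The totals in the list follow from the per-device values, and the linear scheduler with warmup has a simple closed form. The sketch below reproduces both. The total step count is not stated on the card; `ASSUMED_TOTAL_STEPS` is an estimate derived from the results table (step 34000 corresponds to epoch 7.8468, so 8 epochs is roughly 34,664 steps), and `linear_lr` is an illustrative helper, not the training code.

```python
PER_DEVICE_TRAIN_BATCH = 16
PER_DEVICE_EVAL_BATCH = 16
NUM_DEVICES = 5
GRAD_ACCUM_STEPS = 4
PEAK_LR = 9e-05
WARMUP_STEPS = 2400

# Assumption: estimated from the results table, not stated on the card.
ASSUMED_TOTAL_STEPS = 34_664

# Samples contributing to one optimizer update across all devices.
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
# Evaluation uses no gradient accumulation.
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup: int = WARMUP_STEPS,
              total: int = ASSUMED_TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return peak_lr * step / warmup
    return peak_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    print(total_train_batch, total_eval_batch)
    for step in (0, 1200, 2400, 17000, ASSUMED_TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate is still ramping up during the first two logged evaluations (steps 1000 and 2000) and reaches its peak at step 2400.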

Training results

| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.6063 | 0.2308 | 1000 | 0.9999 | 62.7936 |
| 0.8076 | 0.4616 | 2000 | 1.4315 | 92.2763 |
| 0.8017 | 0.6924 | 3000 | 1.4195 | 80.7656 |
| 0.7362 | 0.9231 | 4000 | 1.3165 | 69.8551 |
| 0.6636 | 1.1539 | 5000 | 1.2294 | 75.7631 |
| 0.6451 | 1.3847 | 6000 | 1.1654 | 69.4671 |
| 0.6148 | 1.6155 | 7000 | 1.1425 | 71.3658 |
| 0.6066 | 1.8463 | 8000 | 1.1058 | 67.0564 |
| 0.5576 | 2.0771 | 9000 | 1.3624 | 61.4640 |
| 0.5486 | 2.3079 | 10000 | 1.0282 | 61.1123 |
| 0.5282 | 2.5387 | 11000 | 1.2816 | 57.7962 |
| 0.5171 | 2.7694 | 12000 | 0.9579 | 55.6079 |
| 0.4925 | 3.0002 | 13000 | 0.9670 | 64.6249 |
| 0.4548 | 3.2310 | 14000 | 0.9398 | 55.2250 |
| 0.4432 | 3.4618 | 15000 | 0.9145 | 53.3057 |
| 0.4364 | 3.6926 | 16000 | 0.9406 | 57.8427 |
| 0.4236 | 3.9234 | 17000 | 1.1792 | 52.9177 |
| 0.385 | 4.1542 | 18000 | 0.8318 | 50.4759 |
| 0.3835 | 4.3850 | 19000 | 0.7850 | 46.2856 |
| 0.3633 | 4.6157 | 20000 | 1.0193 | 49.7569 |
| 0.3593 | 4.8465 | 21000 | 0.9096 | 50.6932 |
| 0.3198 | 5.0773 | 22000 | 0.9480 | 51.0502 |
| 0.3157 | 5.3081 | 23000 | 0.8693 | 46.8236 |
| 0.3033 | 5.5389 | 24000 | 0.7335 | 44.8112 |
| 0.3025 | 5.7697 | 25000 | 0.7442 | 45.8769 |
| 0.2873 | 6.0005 | 26000 | 0.9870 | 50.2069 |
| 0.2735 | 6.2312 | 27000 | 0.9005 | 47.9617 |
| 0.2663 | 6.4620 | 28000 | 0.7609 | 40.1035 |
| 0.2598 | 6.6928 | 29000 | 0.7688 | 42.9902 |
| 0.2549 | 6.9236 | 30000 | 0.7648 | 42.9126 |
| 0.2404 | 7.1544 | 31000 | 0.7478 | 43.0626 |
| 0.2437 | 7.3852 | 32000 | 0.7893 | 40.1604 |
| 0.2451 | 7.6160 | 33000 | 0.7712 | 41.7848 |
| 0.2389 | 7.8468 | 34000 | 0.7687 | 41.5261 |
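
The headline numbers come from the last logged step, which is not the best row in the log. As a small sketch, the snippet below scans the rows from step 24000 onward (copied from the results above; earlier rows are omitted for brevity and all have higher validation loss and WER) and picks the lowest-WER and lowest-loss evaluations. Whether checkpoints for those steps were saved or published is not stated on the card; the `EvalRow` type is illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float


# Subset of the logged evaluations (step 24000 onward), copied from the card.
ROWS = [
    EvalRow(24000, 0.7335, 44.8112),
    EvalRow(25000, 0.7442, 45.8769),
    EvalRow(26000, 0.9870, 50.2069),
    EvalRow(27000, 0.9005, 47.9617),
    EvalRow(28000, 0.7609, 40.1035),
    EvalRow(29000, 0.7688, 42.9902),
    EvalRow(30000, 0.7648, 42.9126),
    EvalRow(31000, 0.7478, 43.0626),
    EvalRow(32000, 0.7893, 40.1604),
    EvalRow(33000, 0.7712, 41.7848),
    EvalRow(34000, 0.7687, 41.5261),
]

best_by_wer = min(ROWS, key=lambda row: row.wer)
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
final = ROWS[-1]

if __name__ == "__main__":
    print(f"lowest WER:  step {best_by_wer.step} ({best_by_wer.wer:.4f})")
    print(f"lowest loss: step {best_by_loss.step} ({best_by_loss.val_loss:.4f})")
    print(f"final step {final.step} is {final.wer - best_by_wer.wer:.4f} WER points above the best")
```

The lowest WER (step 28000) and the lowest validation loss (step 24000) fall on different steps, so the choice of selection metric matters if an earlier checkpoint is ever preferred over the final one.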

Framework versions

  • Transformers 4.47.0.dev0
  • PyTorch 2.4.1+cu124
  • Datasets 3.1.1.dev0
  • Tokenizers 0.20.3
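
With a compatible Transformers install, a Whisper fine-tune like this one can usually be loaded through the standard automatic-speech-recognition pipeline. The following is a minimal sketch under assumptions: the repository id `voa-engines/Charcot-Voice_3_0` is assumed to be publicly downloadable, `transcribe` is an illustrative helper name, and the card does not document the expected language, sampling rate, or audio domain.

```python
MODEL_ID = "voa-engines/Charcot-Voice_3_0"  # assumed to be publicly available


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)  # dict containing the decoded text under "text"
    return result["text"]


# Example (hypothetical file name):
#   print(transcribe("sample.wav"))
```

Decoding a file path this way relies on ffmpeg being installed. Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to construct it once and reuse it.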

Model size: 809M params (Safetensors, tensor type F32)

Model tree for voa-engines/Charcot-Voice_3_0: fine-tuned from openai/whisper-large-v3-turbo