
speecht5_finetuned_fleurs_zh_4000

This model is a fine-tuned version of microsoft/speecht5_tts on the fleurs dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3888

Model description

More information needed

Intended uses & limitations

More information needed
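
Pending fuller documentation, the sketch below shows one way to run text-to-speech inference with this checkpoint through the Transformers SpeechT5 classes. The repo id is a placeholder, and the zero speaker embedding is only a stand-in: SpeechT5 conditions on a 512-dimensional x-vector, and a real embedding matching the training voice should be used.

```python
import torch
import soundfile as sf
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
model_id = "speecht5_finetuned_fleurs_zh_4000"

processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="你好，世界", return_tensors="pt")

# SpeechT5 conditions generation on a 512-dim x-vector speaker embedding.
# torch.zeros is only a stand-in; use an embedding that matches the target voice.
speaker_embeddings = torch.zeros((1, 512))

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("output.wav", speech.numpy(), samplerate=16000)  # SpeechT5 outputs 16 kHz audio
```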

Training and evaluation data

More information needed
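
The model name points to the Mandarin Chinese split of FLEURS. If that reading is right, the data can be loaded roughly as below; the `google/fleurs` repo and the `cmn_hans_cn` config name are assumptions inferred from the model name, not stated in this card.

```python
from datasets import load_dataset

# Assumed config: Mandarin Chinese (Simplified), per the "fleurs_zh" model name.
train_data = load_dataset("google/fleurs", "cmn_hans_cn", split="train")
eval_data = load_dataset("google/fleurs", "cmn_hans_cn", split="validation")

print(train_data[0]["transcription"])  # text paired with 16 kHz audio
```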

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
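
These settings map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's script: it assumes the standard `Seq2SeqTrainer` recipe for SpeechT5 fine-tuning, and the output directory is a placeholder.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch: mirrors the hyperparameters listed above; output_dir is a placeholder.
# Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_finetuned_fleurs_zh_4000",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,  # effective train batch size: 4 * 8 = 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=4000,
    evaluation_strategy="steps",
    eval_steps=100,  # matches the 100-step evaluation cadence in the results table
)
```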

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7366 | 1.09 | 100 | 0.6059 |
| 0.5892 | 2.19 | 200 | 0.5104 |
| 0.5436 | 3.28 | 300 | 0.4585 |
| 0.4848 | 4.38 | 400 | 0.4333 |
| 0.4733 | 5.47 | 500 | 0.4276 |
| 0.4534 | 6.57 | 600 | 0.4194 |
| 0.454 | 7.66 | 700 | 0.4172 |
| 0.4489 | 8.76 | 800 | 0.4111 |
| 0.4401 | 9.85 | 900 | 0.4108 |
| 0.441 | 10.94 | 1000 | 0.4136 |
| 0.437 | 12.04 | 1100 | 0.4078 |
| 0.4333 | 13.13 | 1200 | 0.4067 |
| 0.4328 | 14.23 | 1300 | 0.4002 |
| 0.4289 | 15.32 | 1400 | 0.4015 |
| 0.4254 | 16.42 | 1500 | 0.4012 |
| 0.427 | 17.51 | 1600 | 0.4020 |
| 0.4273 | 18.6 | 1700 | 0.4008 |
| 0.4222 | 19.7 | 1800 | 0.3966 |
| 0.4305 | 20.79 | 1900 | 0.3998 |
| 0.4198 | 21.89 | 2000 | 0.3954 |
| 0.4225 | 22.98 | 2100 | 0.3961 |
| 0.4223 | 24.08 | 2200 | 0.3965 |
| 0.4201 | 25.17 | 2300 | 0.3922 |
| 0.4234 | 26.27 | 2400 | 0.3939 |
| 0.4213 | 27.36 | 2500 | 0.3930 |
| 0.4182 | 28.45 | 2600 | 0.3934 |
| 0.4119 | 29.55 | 2700 | 0.3925 |
| 0.4113 | 30.64 | 2800 | 0.3907 |
| 0.4131 | 31.74 | 2900 | 0.3907 |
| 0.4135 | 32.83 | 3000 | 0.3933 |
| 0.4142 | 33.93 | 3100 | 0.3909 |
| 0.4144 | 35.02 | 3200 | 0.3919 |
| 0.414 | 36.11 | 3300 | 0.3919 |
| 0.418 | 37.21 | 3400 | 0.3899 |
| 0.4094 | 38.3 | 3500 | 0.3897 |
| 0.4149 | 39.4 | 3600 | 0.3924 |
| 0.4105 | 40.49 | 3700 | 0.3905 |
| 0.413 | 41.59 | 3800 | 0.3895 |
| 0.4117 | 42.68 | 3900 | 0.3900 |
| 0.4096 | 43.78 | 4000 | 0.3888 |

Framework versions

  • Transformers 4.33.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3