---
library_name: transformers
license: mit
base_model: microsoft/speecht5_tts
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
model-index:
  - name: SpeechT5-Hausa-9
    results: []
---

# SpeechT5-Hausa-9

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the common_voice_17_0 dataset. It achieves the following results on the evaluation set:

- Loss: 0.6525
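
The card does not yet include usage instructions, so here is a minimal inference sketch. It assumes this checkpoint is published as `Judah04/SpeechT5-Hausa-9` (inferred from the model name) and uses a zero speaker embedding as a placeholder; for natural-sounding output, substitute an x-vector for a speaker seen during training.

```python
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Repo id is an assumption; adjust to wherever this checkpoint is hosted.
processor = SpeechT5Processor.from_pretrained("Judah04/SpeechT5-Hausa-9")
model = SpeechT5ForTextToSpeech.from_pretrained("Judah04/SpeechT5-Hausa-9")

# SpeechT5 predicts mel spectrograms; a separate vocoder renders the waveform.
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Sannu, yaya kake?", return_tensors="pt")

# SpeechT5 conditions on a 512-dim speaker x-vector. Zeros are a placeholder only.
speaker_embeddings = torch.zeros((1, 512))

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speech.wav", speech.numpy(), samplerate=16000)
```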

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
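
Pending details from the author, here is a minimal data-loading sketch. The Hausa (`ha`) config is an assumption inferred from the model name; the card itself only names `common_voice_17_0`. Note that the Hub dataset is gated, so you must accept its terms and authenticate first, and that SpeechT5 operates on 16 kHz audio.

```python
from datasets import Audio, load_dataset

# Assumed: the Hausa ("ha") config of Common Voice 17.0 on the Hub.
# The dataset is gated; accept its terms and run `huggingface-cli login` first.
dataset = load_dataset("mozilla-foundation/common_voice_17_0", "ha", split="train")

# Resample to the 16 kHz rate SpeechT5 expects.
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))
```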

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map to the Trainer API):

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 2000
- mixed_precision_training: Native AMP
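
For reference, these settings might be expressed with `Seq2SeqTrainingArguments` as below. This is a reconstruction, not the author's script: `output_dir` and the evaluation/logging cadence are assumptions (the 100-step eval interval is inferred from the results table), and the Adam settings listed above are the Trainer defaults, so they are not set explicitly.

```python
from transformers import Seq2SeqTrainingArguments

# A sketch of the hyperparameters above; output_dir and the eval/logging
# cadence are assumptions, not taken from the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="SpeechT5-Hausa-9",
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,   # effective train batch size: 4 * 8 = 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=2000,
    fp16=True,                       # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                  # matches the 100-step cadence in the results table
    logging_steps=100,
)
```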

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 0.6188        | 1.6598  | 100  | 0.6391          |
| 0.5578        | 3.3195  | 200  | 0.6273          |
| 0.5346        | 4.9793  | 300  | 0.6454          |
| 0.5193        | 6.6390  | 400  | 0.6131          |
| 0.5011        | 8.2988  | 500  | 0.6113          |
| 0.5069        | 9.9585  | 600  | 0.6259          |
| 0.495         | 11.6183 | 700  | 0.6292          |
| 0.4835        | 13.2780 | 800  | 0.6238          |
| 0.4795        | 14.9378 | 900  | 0.6300          |
| 0.4747        | 16.5975 | 1000 | 0.6222          |
| 0.4746        | 18.2573 | 1100 | 0.6387          |
| 0.4683        | 19.9170 | 1200 | 0.6220          |
| 0.4591        | 21.5768 | 1300 | 0.6474          |
| 0.4593        | 23.2365 | 1400 | 0.6548          |
| 0.4567        | 24.8963 | 1500 | 0.6322          |
| 0.4529        | 26.5560 | 1600 | 0.6476          |
| 0.4495        | 28.2158 | 1700 | 0.6517          |
| 0.4477        | 29.8755 | 1800 | 0.6397          |
| 0.442         | 31.5353 | 1900 | 0.6557          |
| 0.4412        | 33.1950 | 2000 | 0.6525          |
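
Note that validation loss reaches its minimum (0.6113) at step 500 and trends upward thereafter while training loss continues to fall, which suggests the final checkpoint may be mildly overfit.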

### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.19.1