metadata

library_name: transformers
license: mit
base_model: microsoft/speecht5_tts
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
model-index:
  - name: SpeechT5-Hausa-9
    results: []

SpeechT5-Hausa-9

This model is a fine-tuned version of microsoft/speecht5_tts on the common_voice_17_0 dataset. At the final training step (step 2000) it achieves the following result on the evaluation set:

Loss: 0.6525
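
Because this card does not yet document usage, the following is only a minimal inference sketch and not the author's confirmed workflow. It assumes the checkpoint is published under the hypothetical repo id `your-namespace/SpeechT5-Hausa-9`, that the repo also contains the processor files, and that you supply your own 512-dimensional x-vector speaker embedding of shape (1, 512), as the base SpeechT5 TTS model expects. The vocoder `microsoft/speecht5_hifigan` is the one commonly paired with the base model; the card does not state which vocoder was used.

```python
# SpeechT5 with the HiFi-GAN vocoder produces 16 kHz audio.
SAMPLE_RATE_HZ = 16_000


def synthesize(text, speaker_embedding, model_id="your-namespace/SpeechT5-Hausa-9"):
    """Return a 1-D waveform tensor for `text`.

    `model_id` is a placeholder (assumption): replace it with the real repo id.
    `speaker_embedding` is a torch.Tensor of shape (1, 512) that you provide.
    """
    # Heavy dependencies are imported here so the module itself loads
    # even where torch/transformers are not installed.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    # Tokenize the text, then generate a spectrogram and vocode it to audio.
    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        waveform = model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
    return waveform
```

The returned tensor can be written to disk with any audio library at `SAMPLE_RATE_HZ`. If the processor files are missing from the fine-tuned repo, loading the processor from `microsoft/speecht5_tts` instead may work, but that is untested here.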

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 4
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
training_steps: 2000
mixed_precision_training: Native AMP
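
To make the relationship between these values concrete, here is a small sketch in plain Python. It assumes a single training device (which is what 4 × 8 = 32 implies) and models the linear schedule the way linear-warmup-then-linear-decay schedulers are usually defined: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the last step. The function name `linear_schedule_lr` is illustrative, not part of any library.

```python
PEAK_LR = 1e-4          # learning_rate
WARMUP_STEPS = 100      # lr_scheduler_warmup_steps
TOTAL_STEPS = 2000      # training_steps
PER_DEVICE_BATCH = 4    # train_batch_size
GRAD_ACCUM_STEPS = 8    # gradient_accumulation_steps

# Examples consumed per optimizer update (single device assumed).
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS


def linear_schedule_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at optimizer step `step` under linear warmup and decay."""
    if step < warmup:
        # Warmup phase: ramp from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup)
    # Decay phase: ramp from the peak rate down to 0 at `total`.
    remaining = max(0, total - step)
    return peak_lr * remaining / max(1, total - warmup)


if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    for step in (0, 50, 100, 1050, 2000):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 100 and has fallen to half the peak by step 1050, the midpoint of the decay phase.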

Training results

Training Loss	Epoch	Step	Validation Loss
0.6188	1.6598	100	0.6391
0.5578	3.3195	200	0.6273
0.5346	4.9793	300	0.6454
0.5193	6.6390	400	0.6131
0.5011	8.2988	500	0.6113
0.5069	9.9585	600	0.6259
0.4950	11.6183	700	0.6292
0.4835	13.2780	800	0.6238
0.4795	14.9378	900	0.6300
0.4747	16.5975	1000	0.6222
0.4746	18.2573	1100	0.6387
0.4683	19.9170	1200	0.6220
0.4591	21.5768	1300	0.6474
0.4593	23.2365	1400	0.6548
0.4567	24.8963	1500	0.6322
0.4529	26.5560	1600	0.6476
0.4495	28.2158	1700	0.6517
0.4477	29.8755	1800	0.6397
0.4420	31.5353	1900	0.6557
0.4412	33.1950	2000	0.6525
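
The table can be read programmatically to see which logged step had the lowest validation loss. The sketch below copies the (step, training loss, validation loss) values from the table above and compares the best row with the final one. The variable names are illustrative, and whether an intermediate checkpoint was actually saved is not stated in this card.

```python
# (step, training_loss, validation_loss), copied from the results table.
LOG = [
    (100, 0.6188, 0.6391), (200, 0.5578, 0.6273), (300, 0.5346, 0.6454),
    (400, 0.5193, 0.6131), (500, 0.5011, 0.6113), (600, 0.5069, 0.6259),
    (700, 0.4950, 0.6292), (800, 0.4835, 0.6238), (900, 0.4795, 0.6300),
    (1000, 0.4747, 0.6222), (1100, 0.4746, 0.6387), (1200, 0.4683, 0.6220),
    (1300, 0.4591, 0.6474), (1400, 0.4593, 0.6548), (1500, 0.4567, 0.6322),
    (1600, 0.4529, 0.6476), (1700, 0.4495, 0.6517), (1800, 0.4477, 0.6397),
    (1900, 0.4420, 0.6557), (2000, 0.4412, 0.6525),
]

# Row with the lowest validation loss, and the last logged row.
best = min(LOG, key=lambda row: row[2])
final = LOG[-1]

# How much worse the final validation loss is than the best one.
gap = round(final[2] - best[2], 4)

if __name__ == "__main__":
    print("best step:", best[0], "validation loss:", best[2])
    print("final step:", final[0], "validation loss:", final[2])
    print("gap:", gap)
```

The training loss keeps falling through step 2000 while the validation loss does not improve after step 500, which may indicate overfitting; the reported evaluation loss of 0.6525 is the final-step value, not the minimum.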

Framework versions

Transformers 4.44.2
PyTorch 2.4.1+cu121
Datasets 3.0.1
Tokenizers 0.19.1