SpeechT5-it

This model, published as Vinay15/speecht5_finetuned_voxpopuli_it, is a fine-tuned version of microsoft/speecht5_tts on the VoxPopuli dataset. After the final training epoch it achieves the following result on the evaluation set:

  • Loss: 0.4600
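
The card does not include usage code, so the following is only a minimal inference sketch. It assumes the checkpoint loads with the standard SpeechT5 classes from Transformers, that the processor of the base model microsoft/speecht5_tts is compatible with it, and that the stock microsoft/speecht5_hifigan vocoder is used. None of this is confirmed by the card. The helper name `synthesize` and the all-zero speaker embedding are placeholders for illustration; a real 512-dimensional x-vector is likely needed for usable audio.

```python
MODEL_ID = "Vinay15/speecht5_finetuned_voxpopuli_it"
BASE_ID = "microsoft/speecht5_tts"         # assumption: base processor is compatible
VOCODER_ID = "microsoft/speecht5_hifigan"  # assumption: stock HiFi-GAN vocoder
SAMPLE_RATE = 16_000                       # SpeechT5 produces 16 kHz audio


def synthesize(text, speaker_embedding=None):
    """Return a 1-D float tensor holding the generated waveform (hypothetical helper)."""
    # Heavy imports stay local so this module can be imported without torch installed.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(BASE_ID)
    model = SpeechT5ForTextToSpeech.from_pretrained(MODEL_ID)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    if speaker_embedding is None:
        # Placeholder only: SpeechT5 expects a (1, 512) x-vector describing the voice.
        # An all-zero vector will run but is not expected to sound natural.
        speaker_embedding = torch.zeros((1, 512))

    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        return model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
```

Calling `synthesize("Buongiorno a tutti.")` downloads the three checkpoints on first use. The returned tensor can be written to disk with, for example, `soundfile.write("output.wav", waveform.numpy(), samplerate=SAMPLE_RATE)`. The Italian test sentence is itself an assumption based on the `it` suffix in the model name.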

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 40
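
The relationship between these values can be checked with a little arithmetic. The sketch below assumes single-device training (consistent with 4 × 4 = 16), takes the 712 optimizer steps per epoch from the training results table, and re-implements a linear warmup and decay schedule by hand rather than calling the library. The function names are illustrative.

```python
# Values copied from the hyperparameter list; STEPS_PER_EPOCH comes from the results table.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 100
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 712


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = NUM_EPOCHS * STEPS_PER_EPOCH

print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
print("total optimizer steps:", total_steps)
for step in (0, 50, 100, total_steps // 2, total_steps):
    lr = linear_lr(step, LEARNING_RATE, WARMUP_STEPS, total_steps)
    print(f"step {step:>5}: lr = {lr:.3e}")
```

Under these assumptions the schedule spends only 100 of 28480 steps in warmup, so almost the entire run is in the linear decay phase.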

Training results

Training Loss   Epoch   Step    Validation Loss
0.5641          1.0     712     0.5090
0.5394          2.0     1424    0.4915
0.5277          3.0     2136    0.4819
0.5136          4.0     2848    0.4798
0.5109          5.0     3560    0.4733
0.5078          6.0     4272    0.4731
0.5033          7.0     4984    0.4692
0.5021          8.0     5696    0.4691
0.4984          9.0     6408    0.4670
0.4880          10.0    7120    0.4641
0.4910          11.0    7832    0.4641
0.4918          12.0    8544    0.4647
0.4933          13.0    9256    0.4622
0.4990          14.0    9968    0.4619
0.4906          15.0    10680   0.4608
0.4884          16.0    11392   0.4622
0.4847          17.0    12104   0.4616
0.4916          18.0    12816   0.4592
0.4845          19.0    13528   0.4600
0.4788          20.0    14240   0.4594
0.4746          21.0    14952   0.4607
0.4875          22.0    15664   0.4615
0.4831          23.0    16376   0.4597
0.4798          24.0    17088   0.4595
0.4727          25.0    17800   0.4592
0.4736          26.0    18512   0.4598
0.4746          27.0    19224   0.4608
0.4728          28.0    19936   0.4589
0.4771          29.0    20648   0.4593
0.4743          30.0    21360   0.4588
0.4785          31.0    22072   0.4601
0.4757          32.0    22784   0.4597
0.4731          33.0    23496   0.4598
0.4746          34.0    24208   0.4593
0.4715          35.0    24920   0.4599
0.4769          36.0    25632   0.4622
0.4778          37.0    26344   0.4605
0.4798          38.0    27056   0.4594
0.4694          39.0    27768   0.4607
0.4680          40.0    28480   0.4600
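
The validation loss flattens out well before the last epoch, so it can be useful to check which epoch actually scored best. The following sketch copies the validation-loss column from the table above into a list and picks the minimum. The helper name `best_epoch` is illustrative, and the card does not say whether the published weights come from the best or the final checkpoint.

```python
# Validation loss per epoch (epochs 1..40), copied from the results table.
VAL_LOSS = [
    0.5090, 0.4915, 0.4819, 0.4798, 0.4733, 0.4731, 0.4692, 0.4691, 0.4670, 0.4641,
    0.4641, 0.4647, 0.4622, 0.4619, 0.4608, 0.4622, 0.4616, 0.4592, 0.4600, 0.4594,
    0.4607, 0.4615, 0.4597, 0.4595, 0.4592, 0.4598, 0.4608, 0.4589, 0.4593, 0.4588,
    0.4601, 0.4597, 0.4598, 0.4593, 0.4599, 0.4622, 0.4605, 0.4594, 0.4607, 0.4600,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) of the lowest loss; ties resolve to the earliest epoch."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


epoch, loss = best_epoch(VAL_LOSS)
gap = VAL_LOSS[-1] - loss
print(f"best epoch: {epoch} (validation loss {loss:.4f})")
print(f"final epoch: {len(VAL_LOSS)} (validation loss {VAL_LOSS[-1]:.4f}, gap {gap:.4f})")
```

On this table the minimum falls at epoch 30 (0.4588), only marginally below the final-epoch value of 0.4600 reported at the top of the card.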

Framework versions

  • Transformers 4.30.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.13.1
  • Tokenizers 0.13.3

Model size

  • Parameters: 146M
  • Tensor type: F32
  • Format: Safetensors