Oysiyl/speecht5_tts_common_voice_uk

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the common_voice_16_1 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5430
 ## Model description
@@ -37,24 +37,29 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
-- train_batch_size: 64
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.8074        | 1.0   | 422  | 0.5961          |
-| 0.571         | 2.0   | 844  | 0.5699          |
-| 0.5418        | 2.99  | 1266 | 0.5517          |
-| 0.5241        | 3.99  | 1688 | 0.5487          |
-| 0.5105        | 4.99  | 2110 | 0.5430          |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the common_voice_16_1 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4211
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
+- train_batch_size: 32
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.8081        | 0.99  | 130  | 0.5394          |
+| 0.5563        | 1.98  | 260  | 0.4533          |
+| 0.5041        | 2.98  | 390  | 0.4468          |
+| 0.4962        | 3.97  | 520  | 0.4628          |
+| 0.495         | 4.96  | 650  | 0.4335          |
+| 0.4777        | 5.95  | 780  | 0.4259          |
+| 0.4752        | 6.95  | 910  | 0.4234          |
+| 0.4673        | 7.94  | 1040 | 0.4229          |
+| 0.4741        | 8.93  | 1170 | 0.4232          |
+| 0.4666        | 9.92  | 1300 | 0.4211          |
 ### Framework versions
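
The updated hyperparameters above (learning_rate 1e-05, linear scheduler, 100 warmup steps) can be illustrated with a minimal sketch of a linear warmup followed by linear decay. This is not the training code from this repository: the function name `linear_schedule_lr` is hypothetical, and `total_steps=1300` is an assumption taken from the last logged step in the results table, so the real training horizon may differ slightly.

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=100, total_steps=1300):
    """Learning rate at a given optimizer step for linear warmup + linear decay.

    base_lr and warmup_steps come from the model card; total_steps is an
    assumption (last logged step in the results table).
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the evaluation steps logged in the table.
    for step in range(130, 1301, 130):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 100, before the first logged evaluation at step 130, and reaches zero at the final step.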

config.json

@@ -86,7 +86,7 @@
   "speech_decoder_prenet_units": 256,
   "torch_dtype": "float32",
   "transformers_version": "4.37.2",
-  "use_cache": false,
   "use_guided_attention_loss": true,
   "vocab_size": 81
 }

   "speech_decoder_prenet_units": 256,
   "torch_dtype": "float32",
   "transformers_version": "4.37.2",
+  "use_cache": true,
   "use_guided_attention_loss": true,
   "vocab_size": 81
 }

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b0b4edbd24676ad7243a68bb623141090e3a7c7dd229f565844ecfaeb96577b5
 size 577789320

 version https://git-lfs.github.com/spec/v1
+oid sha256:f15792b548370ef392939922942049805149b36822d05642a58a68de3af6f9a8
 size 577789320
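
The pointer above records the SHA-256 digest and byte size of the new weights file, so a downloaded copy can be checked against it. The following is a minimal sketch using only the standard library. The helper names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical, and the local path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path

# New-side pointer text for model.safetensors, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f15792b548370ef392939922942049805149b36822d05642a58a68de3af6f9a8
size 577789320
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so large weight files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
# Example (assumed local path):
# file_matches_pointer("model.safetensors", pointer)
```

The size check runs first because it is cheap and rejects truncated downloads before any hashing is done.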

training_args.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1ae5808f1c706a390a93d2838f84685363d3126a61cb8af9f463e11e5bfaa775
 size 4399

 version https://git-lfs.github.com/spec/v1
+oid sha256:a86284ad5f8409a30c10d95d03c6cc21ad42c811bb88f14b26776c223e98656d
 size 4399