---
language:
- ht
pipeline_tag: text-to-speech
tags:
- speecht5
- TTS
---

# Fine-tuned SpeechT5 TTS Model for Haitian Creole

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) for Haitian Creole. It was fine-tuned on the CMU Haitian dataset.

## Model Description

The model is based on SpeechT5, which extends the unified encoder-decoder approach of T5 (Text-to-Text Transfer Transformer) to spoken and written language; this checkpoint is its text-to-speech variant. The model converts input text in Haitian Creole into corresponding speech.
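
As a quick start, here is a minimal inference sketch using the Hugging Face Transformers SpeechT5 API. The model id below is a placeholder for this repository's Hub path (not confirmed by this card), and the CMU ARCTIC x-vector dataset is just one convenient source for the speaker embedding SpeechT5 expects:

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Placeholder model id: substitute this repository's actual Hub path.
model_id = "your-username/speecht5-tts-haitian"
processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Bonjou, kòman ou ye?", return_tensors="pt")

# SpeechT5 conditions generation on a speaker embedding; here we borrow an
# x-vector from the CMU ARCTIC speaker-embedding dataset as a generic voice.
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

# Generate a waveform and write it to disk.
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("output.wav", speech.numpy(), samplerate=16000)
```

The `microsoft/speecht5_hifigan` vocoder produces 16 kHz mono audio, hence the sample rate above.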

## Intended Uses & Limitations

The model is intended for text-to-speech (TTS) applications in Haitian Creole. It can generate speech from written text, enabling applications such as audiobook narration, voice assistants, and more.

However, there are some limitations to be aware of:
- The model's performance depends heavily on the quality and diversity of the training data; fine-tuning on more diverse or domain-specific datasets may improve it.
- Like all machine learning models, it may produce inaccuracies or errors in speech synthesis, especially for complex sentences or domain-specific jargon.

## Training and Evaluation Data

The model was fine-tuned on the CMU Haitian dataset, which contains text and corresponding audio samples in Haitian Creole. The dataset was split into training and evaluation sets to assess the model's performance.
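
A minimal sketch of such a split with Hugging Face Datasets; the dataset id and the 10% evaluation fraction below are placeholders, not values taken from this card:

```python
from datasets import load_dataset

# Placeholder dataset id: substitute the actual Hub id (or local path)
# of the CMU Haitian dataset used for fine-tuning.
dataset = load_dataset("username/cmu-haitian", split="train")

# Hold out a small evaluation split, as described above.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```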

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- per_device_train_batch_size: 16
- gradient_accumulation_steps: 2
- warmup_steps: 500
- max_steps: 4000
- gradient_checkpointing: True
- fp16: True
- evaluation_strategy: "no"
- per_device_eval_batch_size: 8
- save_steps: 1000
- logging_steps: 25
- report_to: ["tensorboard"]
- greater_is_better: False
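
For reference, these settings map directly onto Hugging Face Transformers `Seq2SeqTrainingArguments`. A minimal sketch; the `output_dir` is a placeholder, not taken from this repository:

```python
from transformers import Seq2SeqTrainingArguments

# One-to-one transcription of the hyperparameter list above.
training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts_haitian",  # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    warmup_steps=500,
    max_steps=4000,
    gradient_checkpointing=True,
    fp16=True,
    evaluation_strategy="no",
    per_device_eval_batch_size=8,
    save_steps=1000,
    logging_steps=25,
    report_to=["tensorboard"],
    greater_is_better=False,
)
```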

### Training Results

The training progress and evaluation results are as follows:

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.5147 | 2.42  | 1000 | 0.4753 |
| 0.4932 | 4.84  | 2000 | 0.4629 |
| 0.4926 | 7.26  | 3000 | 0.4566 |
| 0.4907 | 9.69  | 4000 | 0.4542 |
| 0.4839 | 12.11 | 5000 | 0.4532 |

### Training Output

The training was completed with the following output:
- Global Step: 4000
- Training Loss: 0.3344
- Training Runtime: 7123.63 seconds
- Training Samples per Second: 17.97
- Training Steps per Second: 0.562
- Total FLOPs: 1.1690e+16
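
These figures are mutually consistent with a single-device run (an assumption, not stated above): 4,000 optimizer steps × 16 samples per device × 2 gradient-accumulation steps ≈ 128,000 samples, and 128,000 / 7,123.63 s ≈ 17.97 samples per second (likewise 4,000 / 7,123.63 ≈ 0.562 steps per second).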

## Framework Versions

- Transformers 4.31.0
- PyTorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3