anzorq committed on
Commit bc83d38 · verified · 1 Parent(s): 40f1525

Create README.md

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - kbd
+ datasets:
+ - anzorq/kbd_speech
+ pipeline_tag: text-to-speech
+ ---
+ # MMS-TTS Fine-tuned for Kabardian (Speaker: Sokhov Murat)
+
+ This repository contains a fine-tuned version of Facebook's MMS-TTS model, adapted to generate speech in the Kabardian language. The model was trained on audio recordings of a single speaker, Sokhov Murat.
+
+ ## Model Details
+
+ - Base Model: [facebook/mms-tts](https://huggingface.co/facebook/mms-tts)
+ - Fine-tuned on: [anzorq/kbd_speech](https://huggingface.co/datasets/anzorq/kbd_speech) dataset
+ - Speaker: Sokhov Murat
+ - Language: Circassian (Kabardian)
+
+ ## Usage
+
+ To generate speech with this model, use the `pipeline` API from the Transformers library. Here's an example:
+
+ ```python
+ from transformers import pipeline
+ import scipy.io.wavfile  # import the submodule explicitly so scipy.io.wavfile is always available
+
+ model_id = "anzorq/mms_finetune_kbd_murat"
+ # device=0 runs on the first GPU; omit the argument to run on the CPU
+ synthesiser = pipeline("text-to-speech", model_id, device=0)
+
+ text = "дауэ ущыт?"
+ speech = synthesiser(text)
+
+ # Save the generated audio to a file
+ scipy.io.wavfile.write("finetuned_output.wav", rate=speech["sampling_rate"], data=speech["audio"][0])
+ ```
+
+ This code writes the synthesized speech for the given Kabardian text to the audio file `finetuned_output.wav`.
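When given a floating-point array, `scipy.io.wavfile.write` stores a 32-bit float WAV, which some players may not open. As a minimal sketch of an alternative, the helper below writes 16-bit PCM using only the Python standard library. The function name `write_pcm16_wav`, the output file name, and the sine-wave stand-in signal are illustrative assumptions and are not part of this repository.

```python
import math
import struct
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples in [-1.0, 1.0] to `path` as 16-bit PCM."""
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range so overshoot does not wrap around
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16 bits
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(bytes(frames))


# Stand-in signal (not model output): one second of a 440 Hz tone at 16 kHz
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
write_pcm16_wav("pcm16_example.wav", tone, rate)
```

With the pipeline output from the example above, the call might look like `write_pcm16_wav("finetuned_output.wav", speech["audio"][0].tolist(), speech["sampling_rate"])`, assuming `speech["audio"][0]` is a one-dimensional float array.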
+
+ ## Notes
+
+ - Because the original MMS-TTS release has no pre-trained checkpoint for Kabardian, fine-tuning started from the checkpoint of the language with the closest character set (Chechen).
+ - This model performs considerably worse than the fine-tuned VITS model [anzorq/kbd-vits-tts-male](https://huggingface.co/anzorq/kbd-vits-tts-male) for Kabardian text-to-speech.
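Since the base checkpoint was chosen by character-set overlap, it can help to check which characters of an input text fall outside a given vocabulary. The sketch below is one hedged way to do that. The function name `uncovered_characters` and the toy vocabulary are hypothetical; in practice the character set could be taken from the keys of a tokenizer's `get_vocab()`, and the toy set here is not the real vocabulary of any checkpoint.

```python
def uncovered_characters(text, vocab_chars):
    """Return characters of `text` missing from `vocab_chars`, in first-seen order."""
    missing = []
    for ch in text:
        # Skip whitespace, covered characters, and ones already reported
        if ch.isspace() or ch in vocab_chars or ch in missing:
            continue
        missing.append(ch)
    return missing


# Toy vocabulary for illustration only; NOT the real tokenizer vocabulary
toy_vocab = set("абвгдежзиклмнопрстуфхцчшщъыьэюя")
missing = uncovered_characters("дауэ ущыт?", toy_vocab)
print(missing)
```

Characters reported as missing are candidates for being dropped or mishandled at synthesis time, so such a check is a cheap sanity test before generating audio.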
+
+ ## License
+
+ The original MMS-TTS model by Meta is licensed under the CC-BY-NC-4.0 License. This fine-tuned version inherits the same license.
+
+ ## Acknowledgments
+
+ - [AI at Meta](https://ai.meta.com/) for the original MMS-TTS model.
+ - [Sokhov Murat](https://www.instagram.com/carbatay) for providing the audio recordings used for fine-tuning.