anzorq/mms_finetune_kbd_murat


anzorq committed on May 13, 2024

Commit bc83d38 · verified · 1 Parent(s): 40f1525

Create README.md

Files changed (1)

README.md +52 -0


	@@ -0,0 +1,52 @@

+---
+license: cc-by-nc-4.0
+language:
+- kbd
+datasets:
+- anzorq/kbd_speech
+pipeline_tag: text-to-speech
+---
+# MMS-TTS Fine-tuned for Kabardian (Speaker: Sokhov Murat)
+This repository contains a fine-tuned version of Facebook's MMS-TTS model, adapted to generate speech in the Kabardian language. The model was fine-tuned on audio recordings of the speaker Sokhov Murat.
+## Model Details
+- Base Model: [facebook/mms-tts](https://huggingface.co/facebook/mms-tts)
+- Fine-tuned on: [anzorq/kbd_speech](https://huggingface.co/datasets/anzorq/kbd_speech) dataset
+- Speaker: Sokhov Murat
+- Language: Circassian (Kabardian)
+## Usage
+To use this model for text-to-speech generation, load it with the `pipeline` API from the Transformers library. For example:
+```python
+from transformers import pipeline
+import scipy.io.wavfile  # import the submodule explicitly; a bare `import scipy` may not load it
+
+model_id = "anzorq/mms_finetune_kbd_murat"
+# device=0 runs on the first GPU; omit the argument to run on the CPU
+synthesiser = pipeline("text-to-speech", model_id, device=0)
+
+text = "дауэ ущыт?"
+speech = synthesiser(text)
+
+# Save the generated audio to a file
+scipy.io.wavfile.write("finetuned_output.wav", rate=speech["sampling_rate"], data=speech["audio"][0])
+```
+This code writes the synthesized speech for the provided Kabardian text to the audio file `finetuned_output.wav`.
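When `scipy.io.wavfile.write` is given floating-point samples, it writes a floating-point WAV file, which some players and tools do not accept. If that becomes a problem, the waveform can be converted to 16-bit PCM before saving. The sketch below is one way to do that with only the Python standard library. The helper name `write_pcm16_wav`, the output file name, and the 16 kHz test tone are all illustrative assumptions, not part of this repository, and the code assumes the samples are floats in the range [-1.0, 1.0].

```python
import math
import struct
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write a sequence of floats (assumed in [-1.0, 1.0]) as a mono 16-bit PCM WAV file.

    Hypothetical helper for illustration; not part of Transformers or SciPy.
    """
    frames = bytearray()
    for sample in samples:
        # Clip out-of-range values so they cannot overflow the 16-bit range.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(bytes(frames))


# Stand-in signal (0.1 s of a 440 Hz tone) so the sketch runs without the model.
rate = 16000  # example value only
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 10)]
write_pcm16_wav("pcm16_example.wav", tone, rate)
```

With the pipeline output from the usage example, the call would presumably look like `write_pcm16_wav("finetuned_output.wav", speech["audio"][0], speech["sampling_rate"])`, assuming `speech["audio"][0]` is a one-dimensional array of float samples.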
+## Notes
+- Since there is no pre-trained checkpoint for Kabardian in the original MMS-TTS model, a pre-trained checkpoint for a language with the closest character set (Chechen) was used for fine-tuning.
+- This model's performance is considerably worse than that of the fine-tuned VITS model [anzorq/kbd-vits-tts-male](https://huggingface.co/anzorq/kbd-vits-tts-male) for Kabardian text-to-speech.
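The first note says the starting checkpoint was picked by character-set similarity. The README does not describe how that comparison was made; the sketch below is just one simple way such a comparison could be done, by measuring what fraction of the distinct characters in a text sample each candidate vocabulary covers. The function names and the two toy vocabularies are hypothetical and are not the real MMS-TTS tokenizer vocabularies.

```python
from typing import Dict, Set


def vocab_coverage(text: str, vocab: Set[str]) -> float:
    """Fraction of distinct non-whitespace characters in `text` that appear in `vocab`."""
    chars = {ch for ch in text.lower() if not ch.isspace()}
    if not chars:
        return 1.0
    return len(chars & vocab) / len(chars)


def closest_vocab(text: str, candidates: Dict[str, Set[str]]) -> str:
    """Name of the candidate vocabulary that covers the most characters of `text`."""
    return max(candidates, key=lambda name: vocab_coverage(text, candidates[name]))


# Toy vocabularies for illustration only; NOT real MMS-TTS vocabularies.
candidates = {
    "toy_cyrillic": set("абвгдежзийклмнопрстуфхцчшщъыьэюя"),
    "toy_latin": set("abcdefghijklmnopqrstuvwxyz"),
}

sample = "дауэ ущыт?"
best = closest_vocab(sample, candidates)
print(best, vocab_coverage(sample, candidates[best]))
```

In practice the sample would be a large Kabardian text corpus and each candidate would be the character vocabulary of an existing checkpoint's tokenizer; characters that are not covered (here, the question mark) show what the chosen tokenizer would fail to represent.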
+## License
+The original MMS-TTS model by Meta is licensed under the CC-BY-NC-4.0 License. This fine-tuned version inherits the same license.
+## Acknowledgments
+- [AI at Meta](https://ai.meta.com/) for the original MMS-TTS model.
+- [Sokhov Murat](https://www.instagram.com/carbatay) for providing the audio recordings used for fine-tuning.