Borcherding committed on
Commit f5263d8 · verified
1 Parent(s): 8ab51e2

Update README.md

  1. README.md +112 -106
README.md CHANGED
@@ -1,106 +1,112 @@
1
- ---
2
- license: other
3
- license_name: coqui-public-model-license
4
- license_link: https://coqui.ai/cpml
5
- library_name: coqui
6
- pipeline_tag: text-to-speech
7
- widget:
8
- - text: "Once when I was six years old I saw a magnificent picture"
9
- ---
10
-
11
- # ⓍTTS_v2 - C-3PO Fine-Tuned Model
12
-
13
- This repository hosts a fine-tuned version of the ⓍTTS model, utilizing 20 unique voice lines from C-3PO, the iconic Star Wars character. The voice lines were sourced from [Voicy](https://www.voicy.network/official-soundboards/movies/c3po).
14
-
15
- ![C-3PO](c3po_1.png)
16
-
17
- Listen to a sample of the ⓍTTS_v2 - C-3PO Fine-Tuned Model:
18
-
19
- <audio controls>
20
- <source src="https://huggingface.co/Borcherding/XTTS-v2_C3PO/raw/main/sample_c3po_generated.wav" type="audio/wav">
21
- Your browser does not support the audio element.
22
- </audio>
23
-
24
- Here's a C-3PO mp3 voice line clip from the training data:
25
-
26
- <audio controls>
27
- <source src="https://huggingface.co/Borcherding/XTTS-v2_C3PO/raw/main/reference2.mp3" type="audio/wav">
28
- Your browser does not support the audio element.
29
- </audio>
30
-
31
- ## Features
32
- 🎙️ **Voice Cloning**: Realistic voice cloning with just a short audio clip.
33
- 🌍 **Multi-Lingual Support**: Generates speech in 17 different languages while maintaining C-3PO's distinct voice.
34
- 😃 **Emotion & Style Transfer**: Captures the emotional tone and style of the original voice.
35
- 🔄 **Cross-Language Cloning**: Maintains the unique voice characteristics across different languages.
36
- 🎧 **High-Quality Audio**: Outputs at a 24kHz sampling rate for clear and high-fidelity audio.
37
-
38
- ## Supported Languages
39
- The model supports the following 17 languages: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu), Korean (ko), and Hindi (hi).
40
-
41
- ## Usage in Roll Cage
42
- 🤖💬 Boost your AI experience with this Ollama add-on! Enjoy real-time audio 🎙️ and text 🔍 chats, LaTeX rendering 📜, agent automations ⚙️, workflows 🔄, text-to-image 📝➡️🖼️, image-to-text 🖼️➡️🔤, image-to-video 🖼️➡️🎥 transformations. Fine-tune text 📝, voice 🗣️, and image 🖼️ gens. Includes Windows macro controls 🖥️ and DuckDuckGo search.
43
-
44
- [ollama_agent_roll_cage (OARC)](https://github.com/Leoleojames1/ollama_agent_roll_cage) is a completely local Python & CMD toolset add-on for the Ollama command line interface. The OARC toolset automates the creation of agents, giving the user more control over the likely output. It provides SYSTEM prompt templates for each ./Modelfile, allowing users to design and deploy custom agents quickly. Users can select which local model file is used in agent construction with the desired system prompt.
45
-
46
- ## Why This Model for Roll Cage?
47
- The C-3PO fine-tuned model was designed for the Roll Cage chatbot to enhance user interaction with a familiar and beloved voice. By incorporating C-3PO's distinctive speech patterns and tone, Roll Cage becomes more engaging and entertaining. The addition of multi-lingual support and emotion transfer ensures that the chatbot can communicate effectively and expressively across different languages and contexts, providing a more immersive experience for users.
48
-
49
- ## CoquiTTS and Resources
50
- 🐸💬 **CoquiTTS**: [Coqui TTS on GitHub](https://github.com/coqui-ai/TTS)
51
- 📚 **Documentation**: [ReadTheDocs](https://tts.readthedocs.io/en/latest/)
52
- 👩‍💻 **Questions**: [GitHub Discussions](https://github.com/coqui-ai/TTS/discussions)
53
- 🗯 **Community**: [Discord](https://discord.gg/5eXr5seRrv)
54
-
55
- ## License
56
- This model is licensed under the [Coqui Public Model License](https://coqui.ai/cpml). Read more about the origin story of CPML [here](https://coqui.ai/blog/tts/cpml).
57
-
58
- ## Contact
59
- Join our 🐸Community on [Discord](https://discord.gg/fBC58unbKE) and follow us on [Twitter](https://twitter.com/coqui_ai). For inquiries, email us at info@coqui.ai.
60
-
61
- Using 🐸TTS API:
62
-
63
- ```python
64
- from TTS.api import TTS
65
-
66
- tts = TTS(model_path="D:/CodingGit_StorageHDD/Ollama_Custom_Mods/ollama_agent_roll_cage/AgentFiles/Ignored_TTS/XTTS-v2_C3PO/",
67
- config_path="D:/CodingGit_StorageHDD/Ollama_Custom_Mods/ollama_agent_roll_cage/AgentFiles/Ignored_TTS/XTTS-v2_C3PO/config.json", progress_bar=False, gpu=True).to(self.device)
68
-
69
- # generate speech by cloning a voice using default settings
70
- tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
71
- file_path="output.wav",
72
- speaker_wav="/path/to/target/speaker.wav",
73
- language="en")
74
-
75
- ```
76
-
77
- Using 🐸TTS Command line:
78
-
79
- ```console
80
- tts --model_name tts_models/multilingual/multi-dataset/xtts_v2 \
81
- --text "Bugün okula gitmek istemiyorum." \
82
- --speaker_wav /path/to/target/speaker.wav \
83
- --language_idx tr \
84
- --use_cuda true
85
- ```
86
-
87
- Using the model directly:
88
-
89
- ```python
90
- from TTS.tts.configs.xtts_config import XttsConfig
91
- from TTS.tts.models.xtts import Xtts
92
-
93
- config = XttsConfig()
94
- config.load_json("/path/to/xtts/config.json")
95
- model = Xtts.init_from_config(config)
96
- model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", eval=True)
97
- model.cuda()
98
-
99
- outputs = model.synthesize(
100
- "It took me quite a long time to develop a voice and now that I have it I am not going to be silent.",
101
- config,
102
- speaker_wav="/data/TTS-public/_refclips/3.wav",
103
- gpt_cond_len=3,
104
- language="en",
105
- )
106
- ```
 
 
 
 
 
 
 
1
+ ---
2
+ license: other
3
+ license_name: coqui-public-model-license
4
+ license_link: https://coqui.ai/cpml
5
+ library_name: coqui
6
+ pipeline_tag: text-to-speech
7
+ widget:
8
+ - text: "Once when I was six years old I saw a magnificent picture"
9
+ ---
10
+
11
+ # ⓍTTS_v2 - C-3PO Fine-Tuned Voice Model (Borcherding/XTTS-v2_C3PO)
12
+ ## Artistic Whimsy and Galactic Musings
13
+ The ⓍTTS (Satirical Text-to-Speech) model in the Borcherding/XTTS-v2_C3PO repository is more than a piece of technology. It is also an art piece: an interplay of code, creativity, and humor. Imagine a digital gallery where visitors encounter C-3PO's satirical musings echoing through the virtual halls.
14
+
15
+ ## Key Features
16
+ - **C-3PO's Quirky Voice**: Fine-tuned on 20 unique voice lines sourced from Voicy, the ⓍTTS model captures C-3PO's distinctive speech patterns. Expect a blend of protocol droid formality, unexpected commentary, and occasional existential musings.
17
+ - **Satirical Tone**: Rather than adhering to a neutral or serious tone, the ⓍTTS model revels in satire. It playfully exaggerates intonation, injects humorous pauses, and occasionally breaks the fourth wall. Each voice line becomes a brushstroke on the canvas of imagination.
18
+
19
+ This repository hosts a fine-tuned version of the ⓍTTS model, utilizing 20 unique voice lines from C-3PO, the iconic Star Wars character. The voice lines were sourced from [Voicy](https://www.voicy.network/official-soundboards/movies/c3po).
20
+
21
+ ![C-3PO](c3po_1.png)
22
+
23
+ Listen to a sample of the ⓍTTS_v2 - C-3PO Fine-Tuned Model:
24
+
25
+ <audio controls>
26
+ <source src="https://huggingface.co/Borcherding/XTTS-v2_C3PO/raw/main/sample_c3po_generated.wav" type="audio/wav">
27
+ Your browser does not support the audio element.
28
+ </audio>
29
+
30
+ Here's a C-3PO mp3 voice line clip from the training data:
31
+
32
+ <audio controls>
33
+ <source src="https://huggingface.co/Borcherding/XTTS-v2_C3PO/raw/main/reference2.mp3" type="audio/mpeg">
34
+ Your browser does not support the audio element.
35
+ </audio>
36
+
37
+ ## Features
38
+ 🎙️ **Voice Cloning**: Realistic voice cloning with just a short audio clip.
39
+ 🌍 **Multi-Lingual Support**: Generates speech in 17 different languages while maintaining C-3PO's distinct voice.
40
+ 😃 **Emotion & Style Transfer**: Captures the emotional tone and style of the original voice.
41
+ 🔄 **Cross-Language Cloning**: Maintains the unique voice characteristics across different languages.
42
+ 🎧 **High-Quality Audio**: Outputs at a 24kHz sampling rate for clear and high-fidelity audio.
43
+
44
+ ## Supported Languages
45
+ The model supports the following 17 languages: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu), Korean (ko), and Hindi (hi).
46
+
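Since the API and CLI examples in this README take one of these codes as the `language` argument, a small guard can catch typos before synthesis starts. This is only a sketch: the set is copied from the list above, and `normalize_language` is a hypothetical helper name, not part of Coqui TTS.

```python
# Language codes copied from the "Supported Languages" list above.
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}


def normalize_language(code: str) -> str:
    """Hypothetical helper: normalize a language code and reject unknown ones."""
    # Accept minor variations such as "ZH_CN" or " en ".
    normalized = code.strip().lower().replace("_", "-")
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return normalized
```

The returned value can then be passed wherever the examples use `language="en"`.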
47
+ ## Usage in Roll Cage
48
+ 🤖💬 Boost your AI experience with this Ollama add-on! Enjoy real-time audio 🎙️ and text 🔍 chats, LaTeX rendering 📜, agent automations ⚙️, workflows 🔄, text-to-image 📝➡️🖼️, image-to-text 🖼️➡️🔤, image-to-video 🖼️➡️🎥 transformations. Fine-tune text 📝, voice 🗣️, and image 🖼️ gens. Includes Windows macro controls 🖥️ and DuckDuckGo search.
49
+
50
+ [ollama_agent_roll_cage (OARC)](https://github.com/Leoleojames1/ollama_agent_roll_cage) is a completely local Python & CMD toolset add-on for the Ollama command line interface. The OARC toolset automates the creation of agents, giving the user more control over the likely output. It provides SYSTEM prompt templates for each ./Modelfile, allowing users to design and deploy custom agents quickly. Users can select which local model file is used in agent construction with the desired system prompt.
51
+
52
+ ## Why This Model for Roll Cage?
53
+ The C-3PO fine-tuned model was designed for the Roll Cage chatbot to enhance user interaction with a familiar and beloved voice. By incorporating C-3PO's distinctive speech patterns and tone, Roll Cage becomes more engaging and entertaining. The addition of multi-lingual support and emotion transfer ensures that the chatbot can communicate effectively and expressively across different languages and contexts, providing a more immersive experience for users.
54
+
55
+ ## CoquiTTS and Resources
56
+ 🐸💬 **CoquiTTS**: [Coqui TTS on GitHub](https://github.com/coqui-ai/TTS)
57
+ 📚 **Documentation**: [ReadTheDocs](https://tts.readthedocs.io/en/latest/)
58
+ 👩‍💻 **Questions**: [GitHub Discussions](https://github.com/coqui-ai/TTS/discussions)
59
+ 🗯 **Community**: [Discord](https://discord.gg/5eXr5seRrv)
60
+
61
+ ## License
62
+ This model is licensed under the [Coqui Public Model License](https://coqui.ai/cpml). Read more about the origin story of CPML [here](https://coqui.ai/blog/tts/cpml).
63
+
64
+ ## Contact
65
+ Join our 🐸Community on [Discord](https://discord.gg/fBC58unbKE) and follow us on [Twitter](https://twitter.com/coqui_ai). For inquiries, email us at info@coqui.ai.
66
+
67
+ Using 🐸TTS API:
68
+
69
+ ```python
70
+ from TTS.api import TTS
71
+
72
+ tts = TTS(model_path="D:/CodingGit_StorageHDD/Ollama_Custom_Mods/ollama_agent_roll_cage/AgentFiles/Ignored_TTS/XTTS-v2_C3PO/",
73
+ config_path="D:/CodingGit_StorageHDD/Ollama_Custom_Mods/ollama_agent_roll_cage/AgentFiles/Ignored_TTS/XTTS-v2_C3PO/config.json", progress_bar=False, gpu=True)
74
+
75
+ # generate speech by cloning a voice using default settings
76
+ tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
77
+ file_path="output.wav",
78
+ speaker_wav="/path/to/target/speaker.wav",
79
+ language="en")
80
+
81
+ ```
82
+
83
+ Using 🐸TTS Command line:
84
+
85
+ ```console
86
+ tts --model_name tts_models/multilingual/multi-dataset/xtts_v2 \
87
+ --text "Bugün okula gitmek istemiyorum." \
88
+ --speaker_wav /path/to/target/speaker.wav \
89
+ --language_idx tr \
90
+ --use_cuda true
91
+ ```
92
+
93
+ Using the model directly:
94
+
95
+ ```python
96
+ from TTS.tts.configs.xtts_config import XttsConfig
97
+ from TTS.tts.models.xtts import Xtts
98
+
99
+ config = XttsConfig()
100
+ config.load_json("/path/to/xtts/config.json")
101
+ model = Xtts.init_from_config(config)
102
+ model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", eval=True)
103
+ model.cuda()
104
+
105
+ outputs = model.synthesize(
106
+ "It took me quite a long time to develop a voice and now that I have it I am not going to be silent.",
107
+ config,
108
+ speaker_wav="/data/TTS-public/_refclips/3.wav",
109
+ gpt_cond_len=3,
110
+ language="en",
111
+ )
112
+ ```
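
The direct-model example above stops once `outputs` is returned. As a rough sketch of the remaining step, the helper below writes mono float samples to a 16-bit WAV file at the 24 kHz rate named under Features, using only the Python standard library. `write_wav` is a hypothetical name, and the sketch assumes the samples are floats in the range [-1.0, 1.0].

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # the 24 kHz output rate stated under "Features"


def write_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono float samples as 16-bit PCM and return the duration in seconds.

    Hypothetical helper; assumes samples are floats in [-1.0, 1.0].
    """
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to a signed 16-bit integer.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return len(frames) // 2 / sample_rate


if __name__ == "__main__":
    # One second of a 440 Hz tone stands in for synthesized speech.
    tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
    print(write_wav(tone, "tone.wav"))
```

If `outputs["wav"]` from the example above holds the mono waveform as floats (an assumption worth checking against the installed Coqui TTS version), it could be saved with `write_wav(outputs["wav"], "c3po.wav")`.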