---
license: openrail
pipeline_tag: audio-to-audio
tags:
- pretrained
- RVC
- ai
- voice-cloning
- voice-conversion
- Voice2Voice
---

# DMR: Deep and Soft Voice Improvement Pretrain for RVC

## Model description

DMR is a pretrain (a pretrained base model) for RVC, designed to improve voice conversion for soft and deep voices.

## Intended uses

- Improve voice conversion quality, especially for soft and deep voices
- Enhance breathing sounds (and eventually whispering) in voice conversion
- Make better E-girl and Mommy voices :3

## Training data

The model was trained on a custom dataset with the following details:

- Total duration: 11.3 hours
- Language: English
- Number of speakers: 22
  - 16 female speakers
  - 6 male speakers

## Training process

DMR was trained with Applio on an RTX 4060 Ti 16 GB.

- Batch size: 8
- Pitch extraction method: Mangio-Crepe
- Hop length: 32
- Sample rate: 32k

## Usage

To use the DMR pretrain:

1. Download both the D and G files of the DMR model.
2. For standard RVC setup:
   - Place the downloaded files in the `pretrained_v2` folder.
3. For Applio users:
   - Place the downloaded files in the `custom pretrains` folder.
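
The steps above can also be scripted. Below is a minimal Python sketch, not an official installer: the file names `D_DMR.pth` and `G_DMR.pth` and the `downloads` directory are assumptions for illustration, so replace them with the names of the files you actually downloaded and the real path of your `pretrained_v2` or `custom pretrains` folder.

```python
import shutil
from pathlib import Path


def install_pretrain(downloads: Path, target_dir: Path, names: tuple[str, ...]) -> list[Path]:
    """Copy the D and G pretrain files from `downloads` into `target_dir`."""
    # Refuse to continue if any file is missing, since both D and G are needed.
    missing = [name for name in names if not (downloads / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing pretrain file(s): {', '.join(missing)}")

    # Create the target folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)

    # copy2 keeps file metadata and returns the destination path.
    return [Path(shutil.copy2(downloads / name, target_dir / name)) for name in names]


# Example call (hypothetical file names and paths, adjust to your setup):
# install_pretrain(Path("downloads"), Path("pretrained_v2"), ("D_DMR.pth", "G_DMR.pth"))
```

For Applio, pass the path of the `custom pretrains` folder as `target_dir` instead.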

## Additional information

I plan to make a V2 of DMR trained on around 30 hours of speech, using either BigVGAN v2 or EVA-GAN, but I do not have a release date yet.