Razer112/DMR_Pretrain


metadata

license: openrail
pipeline_tag: audio-to-audio
tags:
  - pretrained
  - RVC
  - ai
  - voice-cloning
  - voice-conversion
  - Voice2Voice

DMR: Deep and Soft Voice Improvement Pretrain for RVC

Model description

DMR is a pretrained base model (a "pretrain") for RVC, designed to improve conversion quality for soft and deep voices.

Intended uses

- Improve voice conversion quality, especially for soft and deep voices
- Enhance breathing sounds (and eventually whispering) in voice conversion
- Make better E-girl and Mommy voices :3

Training data

The model was trained on a custom dataset with the following details:

- Total duration: 11.3 hours
- Language: English
- Number of speakers: 22
  - 16 female speakers
  - 6 male speakers

Training process

DMR was trained with Applio on an RTX 4060 Ti 16 GB.

- Batch size: 8
- Pitch extraction method: Mangio-Crepe
- Hop length: 32
- Sample rate: 32 kHz

Usage

To use the DMR pretrain:

- Download both the D and G files of the DMR model.
- For a standard RVC setup:
  - Place the downloaded files in the pretrained_v2 folder.
- For Applio users:
  - Place the downloaded files in the custom pretrains folder.
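
If you prefer to script the copy step, the following is a minimal sketch using only the Python standard library. It assumes you have already downloaded the D and G files. The function name `install_pretrain` and the file names and folder shown in the usage comment are hypothetical, so replace them with the actual file names from this repository and the real location of your pretrained_v2 or custom pretrains folder.

```python
import shutil
from pathlib import Path


def install_pretrain(d_file, g_file, target_dir):
    """Copy the downloaded D and G files into target_dir.

    This helper is illustrative only; it is not part of RVC or Applio.
    """
    sources = [Path(d_file), Path(g_file)]

    # Check both files up front so a half-installed pretrain is never left behind.
    missing = [str(path) for path in sources if not path.is_file()]
    if missing:
        raise FileNotFoundError("Missing pretrain file(s): " + ", ".join(missing))

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # copy2 keeps file timestamps and returns the destination path.
    return [Path(shutil.copy2(src, target / src.name)) for src in sources]


# Usage (hypothetical file names and folder; adjust to your own setup):
# install_pretrain("downloads/D_DMR.pth", "downloads/G_DMR.pth", "pretrained_v2")
```

Both files are copied together because the usage steps call for the D and G files as a pair.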

Additional information

I plan to make a V2 of DMR trained on around 30 hours of speech, using either BigVGAN v2 or EVA-GAN, but there is no release date yet.