Razer112
/

DMR_Pretrain

voice-conversion


	---
	license: openrail
	pipeline_tag: audio-to-audio
	tags:
	- pretrained
	- RVC
	- ai
	- voice-cloning
	- voice-conversion
	- Voice2Voice
	---

	# DMR: Deep and Soft Voice Improvement Pretrain for RVC

	## Model description

	DMR is a pretrain for RVC, designed to improve the conversion quality of soft and deep voices.

	## Intended uses

	- Improve voice conversion quality, especially for soft and deep voices
	- Enhance breathing sounds (and eventually whispering) in voice conversion
	- Make better E-girl and Mommy voices :3

	## Training data

	The model was trained on a custom dataset with the following details:
	- Total duration: 11.3 hours
	- Language: English
	- Number of speakers: 22
	  - 16 female speakers
	  - 6 male speakers

	## Training Process

	DMR was trained with Applio on an RTX 4060 Ti 16 GB, using the following settings:
	- Batch size: 8
	- Pitch extraction method: Mangio-Crepe
	- Hop length: 32
	- Sample rate: 32 kHz

	## Usage

	To use the DMR pretrain:

	1. Download both the D and G files of the DMR model.
	2. For a standard RVC setup:
	   - Place the downloaded files in the `pretrained_v2` folder.
	3. For Applio users:
	   - Place the downloaded files in the `custom pretrains` folder.
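
The copy step above can also be scripted. The following is a minimal sketch, not part of DMR, RVC, or Applio: the helper `install_pretrain` and the file names `D_DMR.pth` and `G_DMR.pth` are hypothetical, so substitute the actual names of the files you downloaded and the real location of the target folder in your installation.

```python
import shutil
from pathlib import Path


def install_pretrain(d_file, g_file, dest_dir):
    """Copy the D and G pretrain files into dest_dir and return the new paths.

    Hypothetical helper: dest_dir is wherever your setup keeps pretrains
    (e.g. the `pretrained_v2` folder for standard RVC).
    """
    sources = [Path(d_file), Path(g_file)]

    # Check both files first so a missing one does not leave a half-installed pair.
    for src in sources:
        if not src.is_file():
            raise FileNotFoundError(f"pretrain file not found: {src}")

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # copy2 preserves timestamps and returns the destination path.
    return [Path(shutil.copy2(src, dest / src.name)) for src in sources]


# Example call (file names and folder path are assumptions):
# install_pretrain("Downloads/D_DMR.pth", "Downloads/G_DMR.pth", "pretrained_v2")
```

Both files are validated before anything is copied, because the pretrain is only usable when the D and G files are present together.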

	## Additional Information

	I plan to make a V2 of DMR with around 30 hours of speech, using either BigVGAN v2 or EVA-GAN, but I do not have a release date yet.