license: mit
easyGUI
easyGUI is a user-friendly voice conversion framework based on VITS. It is designed to eliminate timbre leakage by using top-1 retrieval to replace input source features with features from the training set. Training is fast even on lower-end GPUs, and about 10 minutes of low-noise speech data is enough for good results. The framework provides a simple web interface, supports acceleration on AMD and Intel GPUs ("A card" and "I card"), and uses the RMVPE algorithm for pitch extraction.
Installation
Prerequisites
- Python 3.8 or higher
Installation Steps
Install PyTorch:
pip install torch torchvision torchaudio
Install Dependencies:
pip install -r requirements.txt
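After both install steps, a quick sanity check can confirm that the interpreter meets the Python 3.8 prerequisite and that the PyTorch packages are importable. The following is a minimal sketch, not part of easyGUI: the function name `check_environment` and the message wording are hypothetical, and only the standard library is used.

```python
import importlib.util
import sys

# Minimum version taken from the Prerequisites list.
MIN_PYTHON = (3, 8)
# Import names of the packages installed in the PyTorch step.
PYTORCH_PACKAGES = ("torch", "torchvision", "torchaudio")


def check_environment(version_info=sys.version_info, packages=PYTORCH_PACKAGES):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration; easyGUI does not ship it.
    """
    problems = []
    major, minor = version_info[0], version_info[1]
    if (major, minor) < MIN_PYTHON:
        problems.append(
            f"Python 3.8 or higher is required, found {major}.{minor}"
        )
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"package '{name}' is not installed")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Environment looks OK.")
```

Running the script prints one line per detected problem, so a missing package shows up before the WebUI is started.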
Additional Setup
- Download Assets:
Download the necessary models and files using the scripts in the tools directory.
- Install FFmpeg (Debian/Ubuntu):
sudo apt install ffmpeg
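To confirm that FFmpeg is reachable after installation, the executable can be looked up on PATH from Python. This is a small sketch using only the standard library; the helper names `find_ffmpeg` and `ffmpeg_version_line` are hypothetical and not part of easyGUI.

```python
import shutil
import subprocess


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


def ffmpeg_version_line(path):
    """Return the first line printed by `<path> -version`, or "" if empty."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    ffmpeg_path = find_ffmpeg()
    if ffmpeg_path is None:
        print("ffmpeg was not found on PATH; install it before continuing.")
    else:
        print(ffmpeg_path)
        print(ffmpeg_version_line(ffmpeg_path))
```

On systems without apt, FFmpeg has to be installed through the platform's own package manager, and the same check applies afterwards.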
Usage
Start the WebUI:
python demo.py
Features
- Top-1 retrieval that replaces input source features with training-set features
- Fast training on less powerful GPUs
- Model merging to change timbre
- Advanced pitch extraction with RMVPE
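Two of the features listed above, top-1 retrieval and model merging, can be illustrated with a toy sketch. It is not easyGUI's implementation, and it rests on assumptions the source does not confirm: Euclidean distance for retrieval, linear interpolation for merging, and plain Python lists standing in for feature frames and weight tensors. The names `top1_replace` and `merge_weights` are hypothetical.

```python
import math


def top1_replace(input_frames, training_frames):
    """Replace each input frame with its nearest training frame.

    Assumption: "nearest" means smallest Euclidean distance.
    """
    replaced = []
    for frame in input_frames:
        nearest = min(training_frames, key=lambda t: math.dist(frame, t))
        replaced.append(list(nearest))
    return replaced


def merge_weights(weights_a, weights_b, alpha):
    """Blend two weight dicts as alpha * A + (1 - alpha) * B.

    Assumption: merging is a linear interpolation of matching parameters.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models must have the same parameter names")
    return {
        name: [
            alpha * a + (1 - alpha) * b
            for a, b in zip(weights_a[name], weights_b[name])
        ]
        for name in weights_a
    }


if __name__ == "__main__":
    training = [[1.0, 0.0], [0.0, 1.0]]
    source = [[0.9, 0.1], [0.1, 0.8]]
    print(top1_replace(source, training))  # [[1.0, 0.0], [0.0, 1.0]]

    model_a = {"w": [0.0, 2.0]}
    model_b = {"w": [4.0, 2.0]}
    print(merge_weights(model_a, model_b, 0.5))  # {'w': [2.0, 2.0]}
```

In the retrieval step every output frame comes from the training set, which is why the source speaker's timbre has no path into the converted features. In the merge step, `alpha` controls how far the blended timbre leans toward the first model.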