---
license: mit
language:
- ja
pipeline_tag: text-to-speech
datasets:
- litagin/moe-speech
---

I followed this guide, with some exceptions:
- https://rentry.org/GPT-SoVITS-guide

I used the latest git pull from:
- https://github.com/RVC-Boss/GPT-SoVITS/

I created a Firefox screen-reader plugin to work with SoVITS:
- https://addons.mozilla.org/en-US/firefox/addon/sovits-screen-reader/
- (https://github.com/cpumaxx/sovits-ff-plugin)

I needed to put this in my shell:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cudann/lib/

Put the pth file in your SoVITS_weights_v2 folder and the ckpt file in GPT_weights_v2.

Set both Language for Reference audio and Inference text language to "Japanese", and set slicing to "Slice by every punct".

You should be able to give it a CLEAN Japanese voice clip of 3-10 seconds that is free of NOISE/MUSIC/STATIC, together with a 100% ACCURATE transcription, and get very good results out the other side.

To test inference, you can try the wav file in the repo, using its filename as the "Text for reference audio".

Feel free to keep everything else at the defaults.

If you want to start the inference engine automatically, you can do something like:

    python3 /path/to/GPT_SoVITS/inference_webui.py "Auto"

If you isolate it à la https://rentry.org/IsolatedLinuxWebService and put nginx in front of it with an SSL cert, you need something like this in the location block:

    proxy_pass http://127.0.0.1:9872/;
    proxy_buffering off;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    client_max_body_size 500M;
    proxy_set_header X-Forwarded-Proto $scheme;
    add_header 'Content-Security-Policy' 'upgrade-insecure-requests';
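
If you want a quick sanity check on the 3-10 second guideline for reference clips before loading one into the web UI, something like the sketch below may help. It is not part of GPT-SoVITS; the function names `clip_duration_seconds` and `is_usable_reference` are made up for illustration, and it assumes the clip is a plain PCM WAV file that Python's standard `wave` module can open. It only checks length, so it cannot tell you whether the clip is clean or the transcription is accurate.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the frame count divided by the sample rate.
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str, min_s: float = 3.0, max_s: float = 10.0) -> bool:
    """Check whether a clip falls inside the 3-10 second range suggested above.

    The default bounds are taken from this README; they are an assumption
    about what works well, not a limit enforced by this script.
    """
    return min_s <= clip_duration_seconds(path) <= max_s
```

Call `is_usable_reference("/path/to/your/clip.wav")` on a candidate clip; a `False` result suggests trimming or picking a different clip before testing inference.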