How to make semantic data from wav files?

by DeveloperEdy - opened Sep 1, 2023

Sep 1, 2023

Hi, I'm Edy, and I want to make a Korean tokenizer for bark voice cloning.
I am wondering how to make semantic data from wav source files in Japanese.
I would appreciate any help.

junwchina

Owner Sep 1, 2023

You need to follow these steps:

1. Generate semantic data from text (Korean content).
2. Generate wav files from the semantic data above.
3. Train a Korean model on the wavs and semantic data. The wav files are the input of this model, and the semantic data is the output.

Basically, this model is used to predict semantic data from a wav file. For more details, you can check my train script.
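
As a rough illustration of steps 1 and 2, here is a minimal sketch of how the (wav, semantic) training pairs could be produced. It is not taken from the train script. The function name `make_training_pairs` and the `limit` parameter are hypothetical, and the two generation functions are passed in as arguments so the sketch does not depend on any particular library. With bark installed, `text_to_semantic` and `semantic_to_waveform` from `bark.api` look like natural candidates for those two arguments, but that is an assumption about how the data step is wired.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

S = TypeVar("S")  # type of the semantic data (e.g. an array of token ids)
W = TypeVar("W")  # type of the audio data (e.g. an array of samples)


def make_training_pairs(
    texts: Iterable[str],
    text_to_semantic: Callable[[str], S],
    semantic_to_wav: Callable[[S], W],
    limit: int,
) -> Iterator[Tuple[W, S]]:
    """Yield at most `limit` (wav, semantic) pairs.

    Hypothetical helper: the wav is the model input and the semantic
    data is the target, matching step 3 above.
    """
    # islice stops after `limit` items, even if `texts` never ends.
    for text in islice(texts, limit):
        semantic = text_to_semantic(text)   # step 1: text -> semantic data
        wav = semantic_to_wav(semantic)     # step 2: semantic data -> wav
        yield wav, semantic


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without any model.
    def fake_semantic(text: str) -> list:
        return [ord(ch) % 100 for ch in text]

    def fake_wav(semantic: list) -> list:
        return [token / 100.0 for token in semantic]

    for wav, semantic in make_training_pairs(
        ["안녕하세요", "감사합니다"], fake_semantic, fake_wav, limit=2
    ):
        print(len(wav), len(semantic))
```

Because the pairs are yielded lazily and capped by `limit`, the data step stops after a fixed number of samples instead of running until interrupted. How large that number should be is not something this sketch answers.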

atlonxp

Dec 29, 2023

Thank you for the instructions. I'm curious about step 1 (creating data): I saw that it runs indefinitely.

Do you have any recommendation for how many semantic files we should produce? Is there a magic number?
