[Kokoro] Use default params

#17
by hexgrad - opened

The default params for generate are reasonable. This should hopefully guard against breaking changes in the future.

https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena/discussions/16#6742fc42ac3539474978ae4a

@hexgrad There is an issue with the Gradio API: if no value is passed for a slider, it takes the slider's minimum value, not its default value. I doubt you want that.

@Pendrokar Thanks, fixed in this PR. We can keep these 3 params to avoid speed getting sent down to min/half:

        1: "af",  # voice
        2: None,  # ps
        3: 1,     # speed
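
For illustration, here is a minimal sketch of how an index-keyed override table like the one above could be expanded into a positional input list before an API call. The names `KOKORO_OVERRIDES` and `build_inputs` are hypothetical, and putting the text at index 0 is an assumption; this is not the Arena's actual code.

```python
# Hypothetical override table: positional input index -> value to send.
KOKORO_OVERRIDES = {
    1: "af",  # voice
    2: None,  # ps
    3: 1,     # speed, sent explicitly so the slider does not fall back to its minimum
}


def build_inputs(text, overrides):
    """Return a positional input list with `text` at index 0 (an assumption)
    and each override placed at its own index."""
    size = max(overrides, default=0) + 1
    inputs = [None] * size
    inputs[0] = text
    for index, value in overrides.items():
        inputs[index] = value
    return inputs


inputs = build_inputs("Hello from the Arena.", KOKORO_OVERRIDES)
```

Any parameter past index 3 is simply not sent, so the Space's own defaults apply to it.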

Most of the params being deleted are deprecated, and the two params that remain after the API change are optional and not that important for the Arena:

  • trim cuts from both ends. Min 0, default 3000, max 24000. Leading and trailing artifacts have been mostly trained out of the model by v0.19, at least for starred/stable voices, so it does not really matter whether this is 0 or 3000. Since it's 24 kHz audio, a trim of 3000 samples cuts only 0.125 seconds from each end. Edit: I'm not sure if trim will stick around in its current form, so I'd rather not have a value specified for the API call from the Arena.
  • use_gpu is a hardware selection enum ('auto', False, True). It's mostly just QoL for the Space so (1) audio arrives slightly faster on short inputs with CPU generation, and (2) if a user exhausts their ZeroGPU they can switch to CPU for still reasonable speed. Whether Kokoro is running on CPU or GPU, I think it should generally outspeed the model on the other side.
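
The trim arithmetic above can be checked with a short sketch. It assumes trim is counted in samples at 24 kHz, as the 0.125-second figure implies. The helper names are hypothetical and this is not Kokoro's actual trimming code.

```python
SAMPLE_RATE_HZ = 24_000  # sample rate mentioned in the thread


def trim_seconds(trim_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration in seconds removed from EACH end for a given trim in samples."""
    return trim_samples / sample_rate_hz


def trim_both_ends(samples, trim_samples):
    """Drop `trim_samples` items from the start and the end of a sequence."""
    if trim_samples <= 0:
        return samples
    if 2 * trim_samples >= len(samples):
        return samples[:0]  # nothing left after trimming both ends
    return samples[trim_samples:len(samples) - trim_samples]


one_second = list(range(SAMPLE_RATE_HZ))     # stand-in for 1 s of audio
trimmed = trim_both_ends(one_second, 3000)   # default trim from the thread
```

With the default of 3000, one second of audio keeps 18000 samples, so 0.25 seconds are cut in total.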
hexgrad changed pull request status to closed

Accidentally closed when I was trying to edit a message 😬 sorry for the whiplash

hexgrad changed pull request status to open
Pendrokar changed pull request status to merged
