This is a streamlined version of WavTokenizer-large-speech-75token that exposes the model as separate encoder and decoder components.

  • Model size reduced from 1.75GB to ~330MB by keeping only the components needed for inference
  • Split interface: 82MB encoder and 248MB decoder

The model is split into:

  • encoder/: Encodes input audio into tokens
  • decoder/: Decodes tokens and synthesizes the output audio
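
Because the two components live in separate folders, it should be possible to fetch only the one you need. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the files sit under `encoder/` and `decoder/` as listed above. The helper names `component_patterns` and `download_component` are hypothetical, and the repository id must be supplied by you; none of this is part of the model's own interface.

```python
from typing import List, Optional


def component_patterns(component: str) -> List[str]:
    """Return glob patterns that select one component folder of the repo."""
    if component not in ("encoder", "decoder"):
        raise ValueError(f"unknown component: {component!r}")
    return [f"{component}/*"]


def download_component(repo_id: str, component: str,
                       local_dir: Optional[str] = None) -> str:
    """Download only the chosen component folder and return the local path.

    repo_id is a placeholder you must fill in; it is not given in this card.
    """
    # Imported lazily so the pattern helper works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=component_patterns(component),
        local_dir=local_dir,
    )
```

For example, `download_component("<your-repo-id>", "encoder")` would fetch only the encoder files, which is useful when tokenization and synthesis run on different machines.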