This is a streamlined version of WavTokenizer-large-speech-75token that exposes the model through separate encoder and decoder components.
- Reduced model size from 1.75GB to ~330MB by keeping only necessary components for inference
- Split interface (82MB encoder, 248MB decoder)
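
As a quick sanity check on the figures above, the two component sizes can be added up and compared with the original checkpoint. This is only a sketch of the arithmetic: the variable names are illustrative, and it assumes the sizes are quoted in decimal units (1 GB = 1000 MB), which the source does not state.

```python
# Sizes as quoted in the list above, in MB.
ENCODER_MB = 82
DECODER_MB = 248

# Assumption: 1 GB = 1000 MB. With 1 GB = 1024 MB the result shifts slightly.
ORIGINAL_MB = 1.75 * 1000

total_mb = ENCODER_MB + DECODER_MB
reduction = 1 - total_mb / ORIGINAL_MB

print(f"Combined size: {total_mb} MB")
print(f"Reduction vs. original: {reduction:.0%}")
```

The combined size matches the ~330MB total, and the reduction comes out at a little over 80% of the original checkpoint under either unit convention.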
The model is split into:
- encoder/: Handles audio encoding
- decoder/: Handles decoding and synthesis
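
To confirm the split on disk, the size of each component directory in a local copy of the repository can be measured. The sketch below uses only the Python standard library. The local path `./wavtokenizer-streamlined` and the helper `dir_size_mb` are hypothetical, introduced here for illustration; only the `encoder/` and `decoder/` directory names come from the description above.

```python
from pathlib import Path


def dir_size_mb(path: Path) -> float:
    """Total size of all regular files under `path`, in MB (10**6 bytes)."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file()) / 1e6


# Hypothetical location of a local copy of this repository; adjust as needed.
repo = Path("./wavtokenizer-streamlined")

for component in ("encoder", "decoder"):
    component_dir = repo / component
    if component_dir.is_dir():
        print(f"{component}/: {dir_size_mb(component_dir):.0f} MB")
    else:
        print(f"{component}/: not found under {repo}")
```

If only encoding is needed (for example, to tokenize audio offline), this makes it easy to see that the `encoder/` directory alone is the smaller of the two downloads.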