This is a streamlined version of WavTokenizer-large-speech-75token that exposes the model as separate encoder and decoder components.

Model size is reduced from 1.75GB to ~330MB by keeping only the components needed for inference.
The interface is split into an 82MB encoder and a 248MB decoder.

The model is split into:

encoder/: Handles audio encoding
decoder/: Handles decoding and synthesis
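
Because the two components live in separate subfolders, it should be possible to download only the one you need. The following is a minimal sketch, assuming the files are laid out under the encoder/ and decoder/ subfolders listed above and that the huggingface_hub package is installed. The helper names (component_patterns, fetch_component) and the repository id are hypothetical placeholders, not part of this model's interface, and the sketch covers only the download step, not how the files are loaded.

```python
from pathlib import Path

# Subfolder names taken from the layout described above.
COMPONENTS = ("encoder", "decoder")


def component_patterns(component: str) -> list[str]:
    """Return glob patterns that match only one component's files."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    return [f"{component}/*"]


def fetch_component(repo_id: str, component: str) -> Path:
    """Download a single component subfolder and return its local path."""
    # Imported lazily so component_patterns works without huggingface_hub.
    from huggingface_hub import snapshot_download

    local_root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=component_patterns(component),
    )
    return Path(local_root) / component


if __name__ == "__main__":
    # Hypothetical placeholder: substitute this repository's actual id.
    encoder_dir = fetch_component("your-namespace/your-repo", "encoder")
    print(encoder_dir)
```

Fetching only encoder/ would transfer roughly 82MB instead of the full ~330MB, which may matter when only tokenization is needed.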
