Descript Audio Codec

👉 With Descript Audio Codec, you can compress 44.1 kHz audio into discrete codes at a low 8 kbps bitrate.
🤌 That's approximately 90x compression while maintaining exceptional fidelity and minimizing artifacts.
💪 Our universal model works on all domains (speech, environment, music, etc.), making it widely applicable to generative modeling of all audio.
👌 It can be used as a drop-in replacement for EnCodec for all audio language modeling applications (such as AudioLMs, MusicLMs, MusicGen, etc.)
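
As a rough sanity check on the "approximately 90x" figure, the sketch below compares the 8 kbps code bitrate against uncompressed PCM. The baseline of 16-bit mono PCM is an assumption made here for illustration; the card does not state which reference format the ratio is measured against, and the helper names are hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int = 16, channels: int = 1) -> float:
    """Bitrate of uncompressed PCM audio in kbps (1 kbps = 1000 bits/s)."""
    return sample_rate_hz * bit_depth * channels / 1000


def compression_ratio(sample_rate_hz: int, codec_kbps: float,
                      bit_depth: int = 16, channels: int = 1) -> float:
    """Ratio of the raw PCM bitrate to the codec bitrate."""
    return pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels) / codec_kbps


# Assumed baseline: 44.1 kHz, 16-bit, mono PCM (not stated in the card).
raw_kbps = pcm_bitrate_kbps(44_100)
ratio = compression_ratio(44_100, codec_kbps=8.0)

print(f"raw PCM: {raw_kbps:.1f} kbps")
print(f"compression ratio at 8 kbps: {ratio:.1f}x")
```

Under that assumption the raw stream is 705.6 kbps, so 8 kbps corresponds to roughly 88x, which is in line with the rounded figure quoted above. A stereo or 24-bit baseline would give a larger ratio.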

Model Details

Model Description

  • License: MIT

Model Sources

Uses

The model is intended for compressing audio files containing speech, music and environmental sounds.

Out-of-Scope Use

The model is not intended for compressing non-audio data such as text or images.

Bias, Risks, and Limitations

Our model has difficulty reconstructing some challenging audio. It performs best on speech and has more trouble with environmental sounds. It also does not perfectly model some musical instruments, such as glockenspiel, or synthesizer sounds.

How to Get Started with the Model

This model is meant to be used with our official repo linked above. We release the weights here for redundancy; our code can also pull them from their original location on GitHub. Please refer to the official README for usage instructions.

Citation

BibTeX:

@misc{kumar2023highfidelity,
      title={High-Fidelity Audio Compression with Improved RVQGAN}, 
      author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
      year={2023},
      eprint={2306.06546},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}