---
datasets:
- mesolitica/Malaysian-Emilia
language:
- ms
- en
---

# Malaysian Vocos

The [charactr/vocos-mel-24khz](https://huggingface.co/charactr/vocos-mel-24khz) architecture, trained from scratch on [Malaysian Emilia](https://huggingface.co/datasets/mesolitica/Malaysian-Emilia) to produce crisper audio in the Malaysian context!

Changes from the original model:

1. We increased the number of layers.
2. We increased the hidden layer size.

Training logs are on Wandb at https://wandb.ai/huseinzol05/malaysian_vocos_mel_v2?nw=nwuserhuseinzol05. **Training is still in progress.**

## Installation

To use Vocos only in inference mode, install it using:

```bash
pip install vocos
```

## Usage

### Reconstruct audio from mel-spectrogram

```python
import torch
from vocos import Vocos

# Download the checkpoint from the Hugging Face Hub.
vocos = Vocos.from_pretrained("mesolitica/malaysian-vocos-mel-24khz")

# Random stand-in for a real mel-spectrogram: (batch, mel bins, frames).
mel = torch.randn(1, 100, 256)  # B, C, T
audio = vocos.decode(mel)  # waveform tensor of shape (B, num_samples)
```
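
The random tensor above only checks that the model runs. To hear what the vocoder does, a real recording can be passed through it (copy-synthesis). The following is a minimal sketch, not a tested recipe for this checkpoint: it assumes `torchaudio` is available, that the checkpoint expects 24 kHz mono input (inferred from the model name), and that it loads through the standard `feature_extractor` and `decode` methods of the `vocos` package. The function name `copy_synthesis` and the constant `TARGET_SR` are made up for illustration.

```python
TARGET_SR = 24000  # assumed from the "24khz" in the model name


def copy_synthesis(in_path, out_path,
                   repo_id="mesolitica/malaysian-vocos-mel-24khz"):
    """Resynthesize the audio at in_path through Vocos and save it to out_path."""
    # Imported here so the heavy dependencies load only when the function runs.
    import torch
    import torchaudio
    from vocos import Vocos

    vocos = Vocos.from_pretrained(repo_id)

    y, sr = torchaudio.load(in_path)  # y: (channels, samples)
    if y.size(0) > 1:
        y = y.mean(dim=0, keepdim=True)  # mix down to mono
    y = torchaudio.functional.resample(y, orig_freq=sr, new_freq=TARGET_SR)

    with torch.inference_mode():
        mel = vocos.feature_extractor(y)  # (1, mel bins, frames)
        audio = vocos.decode(mel)         # (1, samples)

    torchaudio.save(out_path, audio.cpu(), TARGET_SR)
```

Calling `copy_synthesis("input.wav", "output.wav")` (placeholder file names) writes the reconstructed audio next to the original, so the two can be compared by ear.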