thanks to memoavatar ❤
- .gitattributes +1 -0
- README.md +77 -0
- audio_proj/config.json +4 -0
- audio_proj/diffusion_pytorch_model.safetensors +3 -0
- diffusion_net/config.json +87 -0
- diffusion_net/diffusion_pytorch_model.safetensors +3 -0
- image_proj/config.json +4 -0
- image_proj/diffusion_pytorch_model.safetensors +3 -0
- misc/audio_emotion_classifier/config.json +4 -0
- misc/audio_emotion_classifier/diffusion_pytorch_model.safetensors +3 -0
- misc/face_analysis/models/1k3d68.onnx +3 -0
- misc/face_analysis/models/2d106det.onnx +3 -0
- misc/face_analysis/models/face_landmarker_v2_with_blendshapes.task +3 -0
- misc/face_analysis/models/genderage.onnx +3 -0
- misc/face_analysis/models/glintr100.onnx +3 -0
- misc/face_analysis/models/scrfd_10g_bnkps.onnx +3 -0
- misc/vocal_separator/Kim_Vocal_2.onnx +3 -0
- misc/vocal_separator/download_checks.json +262 -0
- misc/vocal_separator/mdx_model_data.json +415 -0
- misc/vocal_separator/vr_model_data.json +137 -0
- reference_net/config.json +65 -0
- reference_net/diffusion_pytorch_model.safetensors +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+misc/face_analysis/models/face_landmarker_v2_with_blendshapes.task filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,77 @@
---
license: apache-2.0
---

# MEMO

**MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation**
<br>
[Longtao Zheng](https://ltzheng.github.io)\*,
[Yifan Zhang](https://scholar.google.com/citations?user=zuYIUJEAAAAJ)\*,
[Hanzhong Guo](https://scholar.google.com/citations?user=q3x6KsgAAAAJ)\,
[Jiachun Pan](https://scholar.google.com/citations?user=nrOvfb4AAAAJ),
[Zhenxiong Tan](https://scholar.google.com/citations?user=HP9Be6UAAAAJ),
[Jiahao Lu](https://scholar.google.com/citations?user=h7rbA-sAAAAJ),
[Chuanxin Tang](https://scholar.google.com/citations?user=3ZC8B7MAAAAJ),
[Bo An](https://personal.ntu.edu.sg/boan/index.html),
[Shuicheng Yan](https://scholar.google.com/citations?user=DNuiPHwAAAAJ)
<br>
_[Project Page](https://memoavatar.github.io) | [arXiv](https://arxiv.org/abs/2412.04448) | [Model](https://huggingface.co/memoavatar/memo)_

This repository contains the example inference script for the MEMO-preview model. The gif demo below is compressed. See our [project page](https://memoavatar.github.io) for full videos.

<div style="width: 100%; text-align: center;">
<img src="https://github.com/memoavatar/memo/raw/main/assets/demo.gif" alt="Demo GIF" style="width: 100%; height: auto;">
</div>

## Installation

```bash
conda create -n memo python=3.10 -y
conda activate memo
conda install -c conda-forge ffmpeg -y
pip install -e .
```

> Our code will download the checkpoint from Hugging Face automatically, and the models for face analysis and vocal separation will be downloaded to `misc_model_dir` of `configs/inference.yaml`. If you want to download the models manually, please download the checkpoint from [here](https://huggingface.co/memoavatar/memo) and specify the path in `model_name_or_path` of `configs/inference.yaml`.

## Inference

```bash
python inference.py --config configs/inference.yaml --input_image <IMAGE_PATH> --input_audio <AUDIO_PATH> --output_dir <SAVE_PATH>
```

For example:

```bash
python inference.py --config configs/inference.yaml --input_image assets/examples/dicaprio.jpg --input_audio assets/examples/speech.wav --output_dir outputs
```
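
To run the same command over several audio clips, the arguments can be assembled in Python. The following is only a sketch: `build_inference_command` is a hypothetical helper, not part of the repository, and it uses nothing beyond the four flags shown in the command above.

```python
import shlex
from pathlib import Path


def build_inference_command(image, audio, output_dir, config="configs/inference.yaml"):
    """Return the argv list for one inference.py run (hypothetical helper)."""
    return [
        "python", "inference.py",
        "--config", str(config),
        "--input_image", str(image),
        "--input_audio", str(audio),
        "--output_dir", str(output_dir),
    ]


if __name__ == "__main__":
    image = Path("assets/examples/dicaprio.jpg")
    # Extend this list with more clips to batch them against the same image.
    for audio in [Path("assets/examples/speech.wav")]:
        cmd = build_inference_command(image, audio, "outputs")
        # Print the shell-quoted command; to launch it for real, pass `cmd`
        # to subprocess.run(cmd, check=True) from the repository root.
        print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems when paths contain spaces.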

> We tested the code on H100 and RTX 4090 GPUs using CUDA 12. Under the default settings (fps=30, inference_steps=20), the inference time is around 1 second per frame on H100 and 2 seconds per frame on RTX 4090. We welcome community contributions to improve the inference speed or interfaces like ComfyUI.
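
The per-frame figures above translate into a rough wall-clock estimate for a clip. This sketch assumes the output has one frame per `1/fps` seconds of audio and that the per-frame cost stays constant; both are simplifications, and `estimate_inference_seconds` is an illustrative name rather than anything in the repository.

```python
import math


def estimate_inference_seconds(audio_seconds, fps=30, seconds_per_frame=1.0):
    """Rough estimate: number of generated frames times the per-frame cost."""
    frames = math.ceil(audio_seconds * fps)
    return frames * seconds_per_frame


# A 10-second clip at the default fps=30 is 300 frames.
h100 = estimate_inference_seconds(10, seconds_per_frame=1.0)
rtx4090 = estimate_inference_seconds(10, seconds_per_frame=2.0)
print(f"H100: ~{h100 / 60:.0f} min, RTX 4090: ~{rtx4090 / 60:.0f} min")
```

Under these assumptions a 10-second clip takes about 5 minutes on an H100 and about 10 minutes on an RTX 4090.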

## Acknowledgement

Our work is made possible thanks to high-quality open-source talking video datasets (including [HDTF](https://github.com/MRzzm/HDTF), [VFHQ](https://liangbinxie.github.io/projects/vfhq), [CelebV-HQ](https://celebv-hq.github.io), [MultiTalk](https://multi-talk.github.io), and [MEAD](https://wywu.github.io/projects/MEAD/MEAD.html)) and some pioneering works (such as [EMO](https://humanaigc.github.io/emote-portrait-alive) and [Hallo](https://github.com/fudan-generative-vision/hallo)).

## Ethics Statement

We acknowledge the potential of AI in generating talking videos, with applications spanning education, virtual assistants, and entertainment. However, we are equally aware of the ethical, legal, and societal challenges that misuse of this technology could pose.

To reduce potential risks, we have only open-sourced a preview model for research purposes. Demos on our website use publicly available materials. If you have copyright concerns, please contact us and we will address them promptly. Users are required to ensure that their actions align with legal regulations, cultural norms, and ethical standards.

It is strictly prohibited to use the model for creating malicious, misleading, defamatory, or privacy-infringing content, such as deepfake videos for political misinformation, impersonation, harassment, or fraud. We strongly encourage users to review generated content carefully, ensuring it meets ethical guidelines and respects the rights of all parties involved. Users must also ensure that their inputs (e.g., audio and reference images) and outputs are used with proper authorization. Unauthorized use of third-party intellectual property is strictly forbidden.

While users may claim ownership of content generated by the model, they must ensure compliance with copyright laws, particularly when involving public figures' likeness, voice, or other aspects protected under personality rights.

## Citation

If you find our work useful, please use the following citation:

```bibtex
@article{zheng2024memo,
  title={MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation},
  author={Longtao Zheng and Yifan Zhang and Hanzhong Guo and Jiachun Pan and Zhenxiong Tan and Jiahao Lu and Chuanxin Tang and Bo An and Shuicheng Yan},
  journal={arXiv preprint arXiv:2412.04448},
  year={2024}
}
```
audio_proj/config.json
ADDED
@@ -0,0 +1,4 @@
{
  "_class_name": "AudioProjModel",
  "_diffusers_version": "0.31.0"
}
audio_proj/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cb32c18e7890c6550e2312964a48f447b5e30cd449c25cfe06324842d6146f6e
size 145861272
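
The three-line file above is a Git LFS pointer, not the weights themselves: it records the hash algorithm, the digest, and the byte size of the real file. A minimal sketch of how a manually downloaded file could be checked against such a pointer follows; `parse_lfs_pointer` and `matches_pointer` are illustrative names, not part of Git LFS or of this repository.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and digest recorded in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, before hashing a multi-GB file
    h = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:cb32c18e7890c6550e2312964a48f447b5e30cd449c25cfe06324842d6146f6e
size 145861272
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("audio_proj/diffusion_pytorch_model.safetensors", pointer)` on a local copy would then report whether the download is complete and uncorrupted.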
diffusion_net/config.json
ADDED
@@ -0,0 +1,87 @@
{
  "_center_input_sample": false,
  "_class_name": "UNet3DConditionModel",
  "_diffusers_version": "0.31.0",
  "_out_channels": 4,
  "act_fn": "silu",
  "addition_embed_type": null,
  "addition_embed_type_num_heads": 64,
  "addition_time_embed_dim": null,
  "attention_head_dim": 8,
  "attention_type": "default",
  "audio_attention_dim": 768,
  "block_out_channels": [
    320,
    640,
    1280,
    1280
  ],
  "center_input_sample": false,
  "class_embed_type": null,
  "class_embeddings_concat": false,
  "conv_in_kernel": 3,
  "cross_attention_dim": 768,
  "down_block_types": [
    "CrossAttnDownBlock3D",
    "CrossAttnDownBlock3D",
    "CrossAttnDownBlock3D",
    "DownBlock3D"
  ],
  "downsample_padding": 1,
  "dropout": 0.0,
  "dual_cross_attention": false,
  "emo_drop_rate": 0.05,
  "encoder_hid_dim": null,
  "encoder_hid_dim_type": null,
  "flip_sin_to_cos": true,
  "freq_shift": 0,
  "in_channels": 4,
  "layers_per_block": 2,
  "mid_block_only_cross_attention": null,
  "mid_block_scale_factor": 1,
  "mid_block_type": "UNetMidBlock3DCrossAttn",
  "motion_module_kwargs": {
    "attention_block_types": [
      "Temporal_Self",
      "Temporal_Self"
    ],
    "num_attention_heads": 8,
    "num_transformer_block": 1,
    "temporal_attention_dim_div": 1,
    "temporal_position_encoding": true,
    "temporal_position_encoding_max_len": 32
  },
  "motion_module_resolutions": [
    1,
    2,
    4,
    8
  ],
  "norm_eps": 1e-05,
  "norm_num_groups": 32,
  "num_attention_heads": null,
  "num_class_embeds": null,
  "only_cross_attention": false,
  "out_channels": 4,
  "projection_class_embeddings_input_dim": null,
  "resnet_time_scale_shift": "default",
  "reverse_transformer_layers_per_block": null,
  "sample_size": 64,
  "time_cond_proj_dim": null,
  "time_embedding_act_fn": null,
  "time_embedding_dim": null,
  "time_embedding_type": "positional",
  "timestep_post_act": null,
  "transformer_layers_per_block": 1,
  "unet_use_cross_frame_attention": false,
  "unet_use_temporal_attention": false,
  "up_block_types": [
    "UpBlock3D",
    "CrossAttnUpBlock3D",
    "CrossAttnUpBlock3D",
    "CrossAttnUpBlock3D"
  ],
  "upcast_attention": false,
  "use_inflated_groupnorm": true,
  "use_linear_projection": false
}
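
A config like `diffusion_net/config.json` above can be sanity-checked before loading multi-gigabyte weights. The sketch below copies a subset of its fields and applies two checks that are assumptions on my part about how UNet-style configs are usually constrained (one channel width per block, widths divisible by the GroupNorm group count); it is not validation logic taken from the MEMO code, and `check_unet_config` is a hypothetical name.

```python
import json

# Subset of the fields from diffusion_net/config.json shown above.
CONFIG_JSON = """
{
  "block_out_channels": [320, 640, 1280, 1280],
  "down_block_types": ["CrossAttnDownBlock3D", "CrossAttnDownBlock3D",
                       "CrossAttnDownBlock3D", "DownBlock3D"],
  "up_block_types": ["UpBlock3D", "CrossAttnUpBlock3D",
                     "CrossAttnUpBlock3D", "CrossAttnUpBlock3D"],
  "norm_num_groups": 32
}
"""


def check_unet_config(cfg):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    n_levels = len(cfg["block_out_channels"])
    # Assumed convention: one down block and one up block per channel width.
    for key in ("down_block_types", "up_block_types"):
        if len(cfg[key]) != n_levels:
            problems.append(f"{key} has {len(cfg[key])} entries, expected {n_levels}")
    # GroupNorm requires the channel count to be divisible by the group count.
    for channels in cfg["block_out_channels"]:
        if channels % cfg["norm_num_groups"] != 0:
            problems.append(f"{channels} channels not divisible by norm_num_groups")
    return problems


cfg = json.loads(CONFIG_JSON)
print(check_unet_config(cfg) or "config looks consistent")
```

For the values in this commit both checks pass, so the script prints `config looks consistent`.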
diffusion_net/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7581d7e0663fd27a3c7b2b242a7af5eda89e57c67e3259017f8b77d83b930479
size 6712434824
image_proj/config.json
ADDED
@@ -0,0 +1,4 @@
{
  "_class_name": "ImageProjModel",
  "_diffusers_version": "0.31.0"
}
image_proj/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eac89d81e17f120f752548d028b4a9a9ad4abca9401590436b5c8c26d8cd8537
size 6310216
misc/audio_emotion_classifier/config.json
ADDED
@@ -0,0 +1,4 @@
{
  "_class_name": "AudioEmotionClassifierModel",
  "_diffusers_version": "0.31.0"
}
misc/audio_emotion_classifier/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e7c8ca4bcfd1695efcf80398d13e4a5f365ebba0d70052f24a8c232ee50ee76d
size 58827684
misc/face_analysis/models/1k3d68.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df5c06b8a0c12e422b2ed8947b8869faa4105387f199c477af038aa01f9a45cc
size 143607619
misc/face_analysis/models/2d106det.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f001b856447c413801ef5c42091ed0cd516fcd21f2d6b79635b1e733a7109dbf
size 5030888
misc/face_analysis/models/face_landmarker_v2_with_blendshapes.task
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:64184e229b263107bc2b804c6625db1341ff2bb731874b0bcc2fe6544e0bc9ff
size 3758596
misc/face_analysis/models/genderage.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4fde69b1c810857b88c64a335084f1c3fe8f01246c9a191b48c7bb756d6652fb
size 1322532
misc/face_analysis/models/glintr100.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4ab1d6435d639628a6f3e5008dd4f929edf4c4124b1a7169e1048f9fef534cdf
size 260665334
misc/face_analysis/models/scrfd_10g_bnkps.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5838f7fe053675b1c7a08b633df49e7af5495cee0493c7dcf6697200b85b5b91
size 16923827
misc/vocal_separator/Kim_Vocal_2.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ce74ef3b6a6024ce44211a07be9cf8bc6d87728cc852a68ab34eb8e58cde9c8b
size 66759214
misc/vocal_separator/download_checks.json
ADDED
@@ -0,0 +1,262 @@
{
    "current_version": "UVR_Patch_10_6_23_4_27",
    "current_version_ocl": "UVR_Patch_10_6_23_4_27",
    "current_version_mac": "UVR_Patch_10_6_23_4_27",
    "current_version_linux": "UVR_Patch_10_6_23_4_27",
    "vr_download_list": {
        "VR Arch Single Model v5: 1_HP-UVR": "1_HP-UVR.pth",
        "VR Arch Single Model v5: 2_HP-UVR": "2_HP-UVR.pth",
        "VR Arch Single Model v5: 3_HP-Vocal-UVR": "3_HP-Vocal-UVR.pth",
        "VR Arch Single Model v5: 4_HP-Vocal-UVR": "4_HP-Vocal-UVR.pth",
        "VR Arch Single Model v5: 5_HP-Karaoke-UVR": "5_HP-Karaoke-UVR.pth",
        "VR Arch Single Model v5: 6_HP-Karaoke-UVR": "6_HP-Karaoke-UVR.pth",
        "VR Arch Single Model v5: 7_HP2-UVR": "7_HP2-UVR.pth",
        "VR Arch Single Model v5: 8_HP2-UVR": "8_HP2-UVR.pth",
        "VR Arch Single Model v5: 9_HP2-UVR": "9_HP2-UVR.pth",
        "VR Arch Single Model v5: 10_SP-UVR-2B-32000-1": "10_SP-UVR-2B-32000-1.pth",
        "VR Arch Single Model v5: 11_SP-UVR-2B-32000-2": "11_SP-UVR-2B-32000-2.pth",
        "VR Arch Single Model v5: 12_SP-UVR-3B-44100": "12_SP-UVR-3B-44100.pth",
        "VR Arch Single Model v5: 13_SP-UVR-4B-44100-1": "13_SP-UVR-4B-44100-1.pth",
        "VR Arch Single Model v5: 14_SP-UVR-4B-44100-2": "14_SP-UVR-4B-44100-2.pth",
        "VR Arch Single Model v5: 15_SP-UVR-MID-44100-1": "15_SP-UVR-MID-44100-1.pth",
        "VR Arch Single Model v5: 16_SP-UVR-MID-44100-2": "16_SP-UVR-MID-44100-2.pth",
        "VR Arch Single Model v5: 17_HP-Wind_Inst-UVR": "17_HP-Wind_Inst-UVR.pth",
        "VR Arch Single Model v5: UVR-De-Echo-Aggressive by FoxJoy": "UVR-De-Echo-Aggressive.pth",
        "VR Arch Single Model v5: UVR-De-Echo-Normal by FoxJoy": "UVR-De-Echo-Normal.pth",
        "VR Arch Single Model v5: UVR-DeEcho-DeReverb by FoxJoy": "UVR-DeEcho-DeReverb.pth",
        "VR Arch Single Model v5: UVR-DeNoise-Lite by FoxJoy": "UVR-DeNoise-Lite.pth",
        "VR Arch Single Model v5: UVR-DeNoise by FoxJoy": "UVR-DeNoise.pth",
        "VR Arch Single Model v5: UVR-BVE-4B_SN-44100-1": "UVR-BVE-4B_SN-44100-1.pth",
        "VR Arch Single Model v4: MGM_HIGHEND_v4": "MGM_HIGHEND_v4.pth",
        "VR Arch Single Model v4: MGM_LOWEND_A_v4": "MGM_LOWEND_A_v4.pth",
        "VR Arch Single Model v4: MGM_LOWEND_B_v4": "MGM_LOWEND_B_v4.pth",
        "VR Arch Single Model v4: MGM_MAIN_v4": "MGM_MAIN_v4.pth"
    },

    "mdx_download_list": {
        "MDX-Net Model: UVR-MDX-NET Inst HQ 1": "UVR-MDX-NET-Inst_HQ_1.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 2": "UVR-MDX-NET-Inst_HQ_2.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 3": "UVR-MDX-NET-Inst_HQ_3.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 4": "UVR-MDX-NET-Inst_HQ_4.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst HQ 5": "UVR-MDX-NET-Inst_HQ_5.onnx",
        "MDX-Net Model: UVR-MDX-NET Main": "UVR_MDXNET_Main.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst Main": "UVR-MDX-NET-Inst_Main.onnx",
        "MDX-Net Model: UVR-MDX-NET 1": "UVR_MDXNET_1_9703.onnx",
        "MDX-Net Model: UVR-MDX-NET 2": "UVR_MDXNET_2_9682.onnx",
        "MDX-Net Model: UVR-MDX-NET 3": "UVR_MDXNET_3_9662.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst 1": "UVR-MDX-NET-Inst_1.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst 2": "UVR-MDX-NET-Inst_2.onnx",
        "MDX-Net Model: UVR-MDX-NET Inst 3": "UVR-MDX-NET-Inst_3.onnx",
        "MDX-Net Model: UVR-MDX-NET Karaoke": "UVR_MDXNET_KARA.onnx",
        "MDX-Net Model: UVR-MDX-NET Karaoke 2": "UVR_MDXNET_KARA_2.onnx",
        "MDX-Net Model: UVR_MDXNET_9482": "UVR_MDXNET_9482.onnx",
        "MDX-Net Model: UVR-MDX-NET Voc FT": "UVR-MDX-NET-Voc_FT.onnx",
        "MDX-Net Model: Kim Vocal 1": "Kim_Vocal_1.onnx",
        "MDX-Net Model: Kim Vocal 2": "Kim_Vocal_2.onnx",
        "MDX-Net Model: Kim Inst": "Kim_Inst.onnx",
        "MDX-Net Model: Reverb HQ By FoxJoy": "Reverb_HQ_By_FoxJoy.onnx",
        "MDX-Net Model: UVR-MDX-NET Crowd HQ 1 By Aufr33": "UVR-MDX-NET_Crowd_HQ_1.onnx",
        "MDX-Net Model: kuielab_a_vocals": "kuielab_a_vocals.onnx",
        "MDX-Net Model: kuielab_a_other": "kuielab_a_other.onnx",
        "MDX-Net Model: kuielab_a_bass": "kuielab_a_bass.onnx",
        "MDX-Net Model: kuielab_a_drums": "kuielab_a_drums.onnx",
        "MDX-Net Model: kuielab_b_vocals": "kuielab_b_vocals.onnx",
        "MDX-Net Model: kuielab_b_other": "kuielab_b_other.onnx",
        "MDX-Net Model: kuielab_b_bass": "kuielab_b_bass.onnx",
        "MDX-Net Model: kuielab_b_drums": "kuielab_b_drums.onnx"
    },

    "demucs_download_list":{

        "Demucs v4: htdemucs_ft":{
            "f7e0c4bc-ba3fe64a.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/f7e0c4bc-ba3fe64a.th",
            "d12395a8-e57c48e6.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/d12395a8-e57c48e6.th",
            "92cfc3b6-ef3bcb9c.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/92cfc3b6-ef3bcb9c.th",
            "04573f0d-f3cf25b2.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th",
            "htdemucs_ft.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs_ft.yaml"
        },

        "Demucs v4: htdemucs":{
            "955717e8-8726e21a.th": "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th",
            "htdemucs.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs.yaml"
        },

        "Demucs v4: hdemucs_mmi":{
            "75fc33f5-1941ce65.th": "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/75fc33f5-1941ce65.th",
            "hdemucs_mmi.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/hdemucs_mmi.yaml"
        },
        "Demucs v4: htdemucs_6s":{
            "5c90dfd2-34c22ccb.th": "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th",
            "htdemucs_6s.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs_6s.yaml"
        },
        "Demucs v3: mdx":{
            "0d19c1c6-0f06f20e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/0d19c1c6-0f06f20e.th",
            "7ecf8ec1-70f50cc9.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/7ecf8ec1-70f50cc9.th",
            "c511e2ab-fe698775.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/c511e2ab-fe698775.th",
            "7d865c68-3d5dd56b.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/7d865c68-3d5dd56b.th",
            "mdx.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx.yaml"
        },

        "Demucs v3: mdx_q":{
            "6b9c2ca1-3fd82607.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/6b9c2ca1-3fd82607.th",
            "b72baf4e-8778635e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/b72baf4e-8778635e.th",
            "42e558d4-196e0e1b.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/42e558d4-196e0e1b.th",
            "305bc58f-18378783.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/305bc58f-18378783.th",
            "mdx_q.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_q.yaml"
        },

        "Demucs v3: mdx_extra":{
            "e51eebcc-c1b80bdd.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/e51eebcc-c1b80bdd.th",
            "a1d90b5c-ae9d2452.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/a1d90b5c-ae9d2452.th",
            "5d2d6c55-db83574e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/5d2d6c55-db83574e.th",
            "cfa93e08-61801ae1.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/cfa93e08-61801ae1.th",
            "mdx_extra.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_extra.yaml"
        },

        "Demucs v3: mdx_extra_q": {
            "83fc094f-4a16d450.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/83fc094f-4a16d450.th",
            "464b36d7-e5a9386e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/464b36d7-e5a9386e.th",
            "14fc6a69-a89dd0ee.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/14fc6a69-a89dd0ee.th",
            "7fd6ef75-a905dd85.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/7fd6ef75-a905dd85.th",
            "mdx_extra_q.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_extra_q.yaml"
        },

        "Demucs v3: UVR Model":{
            "ebf34a2db.th": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/ebf34a2db.th",
            "UVR_Demucs_Model_1.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/UVR_Demucs_Model_1.yaml"
        },

        "Demucs v3: repro_mdx_a":{
            "9a6b4851-03af0aa6.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/9a6b4851-03af0aa6.th",
            "1ef250f1-592467ce.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/1ef250f1-592467ce.th",
            "fa0cb7f9-100d8bf4.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/fa0cb7f9-100d8bf4.th",
            "902315c2-b39ce9c9.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/902315c2-b39ce9c9.th",
            "repro_mdx_a.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a.yaml"
        },

        "Demucs v3: repro_mdx_a_time_only":{
            "9a6b4851-03af0aa6.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/9a6b4851-03af0aa6.th",
            "1ef250f1-592467ce.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/1ef250f1-592467ce.th",
            "repro_mdx_a_time_only.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a_time_only.yaml"
        },

        "Demucs v3: repro_mdx_a_hybrid_only":{
            "fa0cb7f9-100d8bf4.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/fa0cb7f9-100d8bf4.th",
            "902315c2-b39ce9c9.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/902315c2-b39ce9c9.th",
            "repro_mdx_a_hybrid_only.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a_hybrid_only.yaml"
        },

        "Demucs v2: demucs": {
            "demucs-e07c671f.th": "https://dl.fbaipublicfiles.com/demucs/v3.0/demucs-e07c671f.th"
        },

        "Demucs v2: demucs_extra": {
            "demucs_extra-3646af93.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/demucs_extra-3646af93.th"
        },

        "Demucs v2: demucs48_hq": {
            "demucs48_hq-28a1282c.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/demucs48_hq-28a1282c.th"
        },

        "Demucs v2: tasnet": {
            "tasnet-beb46fac.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/tasnet-beb46fac.th"
        },

        "Demucs v2: tasnet_extra": {
            "tasnet_extra-df3777b2.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/tasnet_extra-df3777b2.th"
        },

        "Demucs v2: demucs_unittest": {
            "demucs_unittest-09ebc15f.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/demucs_unittest-09ebc15f.th"
        },

        "Demucs v1: demucs": {
            "demucs.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/demucs.th"
        },

        "Demucs v1: demucs_extra": {
            "demucs_extra.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/demucs_extra.th"
        },

        "Demucs v1: light": {
            "light.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/light.th"
        },

        "Demucs v1: light_extra": {
            "light_extra.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/light_extra.th"
        },

        "Demucs v1: tasnet": {
            "tasnet.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/tasnet.th"
        },

        "Demucs v1: tasnet_extra": {
            "tasnet_extra.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/tasnet_extra.th"
        }
    },
|
197 |
+
|
198 |
+
"mdx_download_vip_list": {
|
199 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Main_340": "UVR-MDX-NET_Main_340.onnx",
|
200 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Main_390": "UVR-MDX-NET_Main_390.onnx",
|
201 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Main_406": "UVR-MDX-NET_Main_406.onnx",
|
202 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Main_427": "UVR-MDX-NET_Main_427.onnx",
|
203 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Main_438": "UVR-MDX-NET_Main_438.onnx",
|
204 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Inst_82_beta": "UVR-MDX-NET_Inst_82_beta.onnx",
|
205 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Inst_90_beta": "UVR-MDX-NET_Inst_90_beta.onnx",
|
206 |
+
"MDX-Net Model VIP: UVR-MDX-NET_Inst_187_beta": "UVR-MDX-NET_Inst_187_beta.onnx",
|
207 |
+
"MDX-Net Model VIP: UVR-MDX-NET-Inst_full_292": "UVR-MDX-NET-Inst_full_292.onnx"
|
208 |
+
},
|
209 |
+
|
210 |
+
"mdx23_download_list": {
|
211 |
+
"MDX23C Model: MDX23C_D1581": {"MDX23C_D1581.ckpt":"model_2_stem_061321.yaml"}
|
212 |
+
},
|
213 |
+
|
214 |
+
"mdx23c_download_list": {
|
215 |
+
"MDX23C Model: MDX23C-InstVoc HQ": {"MDX23C-8KFFT-InstVoc_HQ.ckpt":"model_2_stem_full_band_8k.yaml"}
|
216 |
+
},
|
217 |
+
|
218 |
+
"roformer_download_list": {
|
219 |
+
"Roformer Model: BS-Roformer-Viperx-1297": {"model_bs_roformer_ep_317_sdr_12.9755.ckpt":"model_bs_roformer_ep_317_sdr_12.9755.yaml"},
|
220 |
+
"Roformer Model: BS-Roformer-Viperx-1296": {"model_bs_roformer_ep_368_sdr_12.9628.ckpt":"model_bs_roformer_ep_368_sdr_12.9628.yaml"},
|
221 |
+
"Roformer Model: BS-Roformer-Viperx-1053": {"model_bs_roformer_ep_937_sdr_10.5309.ckpt":"model_bs_roformer_ep_937_sdr_10.5309.yaml"},
|
222 |
+
"Roformer Model: Mel-Roformer-Viperx-1143": {"model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt":"model_mel_band_roformer_ep_3005_sdr_11.4360.yaml"},
|
223 |
+
"Roformer Model: MelBand Roformer Kim | Inst V1 by Unwa": {"melband_roformer_inst_v1.ckpt":"config_melbandroformer_inst.yaml"},
|
224 |
+
"Roformer Model: MelBand Roformer Kim | Inst V2 by Unwa": {"melband_roformer_inst_v2.ckpt":"config_melbandroformer_inst_v2.yaml"},
|
225 |
+
"Roformer Model: MelBand Roformer Kim | InstVoc Duality V1 by Unwa": {"melband_roformer_instvoc_duality_v1.ckpt":"config_melbandroformer_instvoc_duality.yaml"},
|
226 |
+
"Roformer Model: MelBand Roformer Kim | InstVoc Duality V2 by Unwa": {"melband_roformer_instvox_duality_v2.ckpt":"config_melbandroformer_instvoc_duality.yaml"}
|
227 |
+
},
|
228 |
+
|
229 |
+
"other_network_list": {
|
230 |
+
"Roformer Model: BS-Roformer-Viperx-1297": {"model_bs_roformer_ep_317_sdr_12.9755.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bs_roformer_ep_317_sdr_12.9755.ckpt",
|
231 |
+
"model_bs_roformer_ep_317_sdr_12.9755.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_bs_roformer_ep_317_sdr_12.9755.yaml"},
|
232 |
+
"Roformer Model: BS-Roformer-Viperx-1296": {"model_bs_roformer_ep_368_sdr_12.9628.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bs_roformer_ep_368_sdr_12.9628.ckpt",
|
233 |
+
"model_bs_roformer_ep_368_sdr_12.9628.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_bs_roformer_ep_368_sdr_12.9628.yaml"},
+        "Roformer Model: BS-Roformer-Viperx-1053": {"model_bs_roformer_ep_937_sdr_10.5309.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bs_roformer_ep_937_sdr_10.5309.ckpt",
+            "model_bs_roformer_ep_937_sdr_10.5309.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_bs_roformer_ep_937_sdr_10.5309.yaml"},
+        "Roformer Model: Mel-Roformer-Viperx-1143": {"model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt",
+            "model_mel_band_roformer_ep_3005_sdr_11.4360.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_mel_band_roformer_ep_3005_sdr_11.4360.yaml"},
+        "Roformer Model: MelBand Roformer Kim | Inst V1 by Unwa": {"melband_roformer_inst_v1.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-Inst/resolve/main/melband_roformer_inst_v1.ckpt",
+            "config_melbandroformer_inst.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_inst.yaml"},
+        "Roformer Model: MelBand Roformer Kim | Inst V2 by Unwa": {"melband_roformer_inst_v2.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-Inst/resolve/main/melband_roformer_inst_v2.ckpt",
+            "config_melbandroformer_inst_v2.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_inst_v2.yaml"},
+        "Roformer Model: MelBand Roformer Kim | InstVoc Duality V1 by Unwa": {"melband_roformer_instvoc_duality_v1.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-InstVoc-Duality/resolve/main/melband_roformer_instvoc_duality_v1.ckpt",
+            "config_melbandroformer_instvoc_duality.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_instvoc_duality.yaml"},
+        "Roformer Model: MelBand Roformer Kim | InstVoc Duality V2 by Unwa": {"melband_roformer_instvox_duality_v2.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-InstVoc-Duality/resolve/main/melband_roformer_instvox_duality_v2.ckpt",
+            "config_melbandroformer_instvoc_duality.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_instvoc_duality.yaml"},
+        "Roformer Model: MelBand Roformer Kim | Inst V1 (E) by Unwa": {"inst_v1e.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-Inst/resolve/main/inst_v1e.ckpt",
+            "config_melbandroformer_inst.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_inst.yaml"},
+        "Roformer Model: MelBand Roformer Kim": {"MelBandRoformer.ckpt":"https://huggingface.co/KimberleyJSN/melbandroformer/resolve/main/MelBandRoformer.ckpt",
+            "config_vocals_mel_band_roformer_kim.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_vocals_mel_band_roformer_kim.yaml"}
+    },
+    "mdx23c_download_vip_list": {
+        "MDX23C Model VIP: MDX23C_D1581": {"MDX23C_D1581.ckpt":"model_2_stem_061321.yaml"},
+        "MDX23C Model VIP: MDX23C-InstVoc HQ 2": {"MDX23C-8KFFT-InstVoc_HQ_2.ckpt":"model_2_stem_full_band_8k.yaml"}
+    },
+
+    "roll_back_win_url": "https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/UVR_v5.6.0_setup.exe",
+    "roll_back_macos_x86_64_url": "https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/Ultimate_Vocal_Remover_v5_6_MacOS_x86_64.dmg",
+    "roll_back_macos_arm64_url": "https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/Ultimate_Vocal_Remover_v5_6_MacOS_arm64.dmg",
+
+    "vr_download_vip_list": [],
+    "demucs_download_vip_list": []
+}
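Note on the Roformer entries above: each one appears to map a display name to two files, a `.ckpt` checkpoint and a `.yaml` config, keyed by filename with the download URL as the value. A minimal Python sketch of splitting one such entry, assuming that layout holds for every entry; `split_entry` is a hypothetical helper and not part of UVR or this repository.

```python
# One entry copied verbatim from download_checks.json above.
ENTRY = {
    "MelBandRoformer.ckpt": "https://huggingface.co/KimberleyJSN/melbandroformer/resolve/main/MelBandRoformer.ckpt",
    "config_vocals_mel_band_roformer_kim.yaml": "https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_vocals_mel_band_roformer_kim.yaml",
}


def split_entry(files):
    """Return (checkpoint, config), each as a (filename, url) pair.

    Hypothetical helper: assumes one .ckpt key and one .yaml key per entry.
    """
    checkpoint = config = None
    for name, url in files.items():
        if name.endswith(".ckpt"):
            checkpoint = (name, url)
        elif name.endswith(".yaml"):
            config = (name, url)
    if checkpoint is None or config is None:
        raise ValueError("entry needs one .ckpt key and one .yaml key")
    return checkpoint, config


checkpoint, config = split_entry(ENTRY)
print(checkpoint[0], "->", config[0])
```

The `mdx23c_download_vip_list` entries have a different shape: the value is a YAML config name rather than a URL, and there is no `.yaml` key, so this helper would raise `ValueError` for them.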
misc/vocal_separator/mdx_model_data.json
ADDED
@@ -0,0 +1,415 @@
+{
+    "0ddfc0eb5792638ad5dc27850236c246": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals"
+    },
+    "26d308f91f3423a67dc69a6d12a8793d": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 9,
+        "mdx_n_fft_scale_set": 8192,
+        "primary_stem": "Other"
+    },
+    "2cdd429caac38f0194b133884160f2c6": {
+        "compensate": 1.045,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "2f5501189a2f6db6349916fabe8c90de": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals",
+        "is_karaoke": true
+    },
+    "398580b6d5d973af3120df54cee6759d": {
+        "compensate": 1.75,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "488b3e6f8bd3717d9d7c428476be2d75": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "4910e7827f335048bdac11fa967772f9": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 7,
+        "mdx_n_fft_scale_set": 4096,
+        "primary_stem": "Drums"
+    },
+    "53c4baf4d12c3e6c3831bb8f5b532b93": {
+        "compensate": 1.043,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "5d343409ef0df48c7d78cce9f0106781": {
+        "compensate": 1.075,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "5f6483271e1efb9bfb59e4a3e6d4d098": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 9,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals"
+    },
+    "65ab5919372a128e4167f5e01a8fda85": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 8192,
+        "primary_stem": "Other"
+    },
+    "6703e39f36f18aa7855ee1047765621d": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 9,
+        "mdx_n_fft_scale_set": 16384,
+        "primary_stem": "Bass"
+    },
+    "6b31de20e84392859a3d09d43f089515": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals"
+    },
+    "867595e9de46f6ab699008295df62798": {
+        "compensate": 1.03,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "a3cd63058945e777505c01d2507daf37": {
+        "compensate": 1.03,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals"
+    },
+    "b33d9b3950b6cbf5fe90a32608924700": {
+        "compensate": 1.03,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "c3b29bdce8c4fa17ec609e16220330ab": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 16384,
+        "primary_stem": "Bass"
+    },
+    "ceed671467c1f64ebdfac8a2490d0d52": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "d2a1376f310e4f7fa37fb9b5774eb701": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "d7bff498db9324db933d913388cba6be": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals"
+    },
+    "d94058f8c7f1fae4164868ae8ae66b20": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Vocals"
+    },
+    "dc41ede5961d50f277eb846db17f5319": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 9,
+        "mdx_n_fft_scale_set": 4096,
+        "primary_stem": "Drums"
+    },
+    "e5572e58abf111f80d8241d2e44e7fa4": {
+        "compensate": 1.028,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "e7324c873b1f615c35c1967f912db92a": {
+        "compensate": 1.03,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "1c56ec0224f1d559c42fd6fd2a67b154": {
+        "compensate": 1.025,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 5120,
+        "primary_stem": "Instrumental"
+    },
+    "f2df6d6863d8f435436d8b561594ff49": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "b06327a00d5e5fbc7d96e1781bbdb596": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "94ff780b977d3ca07c7a343dab2e25dd": {
+        "compensate": 1.039,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "73492b58195c3b52d34590d5474452f6": {
+        "compensate": 1.043,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "970b3f9492014d18fefeedfe4773cb42": {
+        "compensate": 1.009,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "1d64a6d2c30f709b8c9b4ce1366d96ee": {
+        "compensate": 1.065,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 5120,
+        "primary_stem": "Instrumental",
+        "is_karaoke": true
+    },
+    "203f2a3955221b64df85a41af87cf8f0": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "291c2049608edb52648b96e27eb80e95": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "ead8d05dab12ec571d67549b3aab03fc": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "cc63408db3d80b4d85b0287d1d7c9632": {
+        "compensate": 1.033,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "cd5b2989ad863f116c855db1dfe24e39": {
+        "compensate": 1.035,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 9,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Reverb"
+    },
+    "55657dd70583b0fedfba5f67df11d711": {
+        "compensate": 1.022,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 6144,
+        "primary_stem": "Instrumental"
+    },
+    "b6bccda408a436db8500083ef3491e8b": {
+        "compensate": 1.02,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "8a88db95c7fb5dbe6a095ff2ffb428b1": {
+        "compensate": 1.026,
+        "mdx_dim_f_set": 2048,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 5120,
+        "primary_stem": "Instrumental"
+    },
+    "b78da4afc6512f98e4756f5977f5c6b9": {
+        "compensate": 1.021,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Instrumental"
+    },
+    "77d07b2667ddf05b9e3175941b4454a0": {
+        "compensate": 1.021,
+        "mdx_dim_f_set": 3072,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 7680,
+        "primary_stem": "Vocals"
+    },
+    "0f2a6bc5b49d87d64728ee40e23bceb1": {
+        "compensate": 1.019,
+        "mdx_dim_f_set": 2560,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 5120,
+        "primary_stem": "Instrumental"
+    },
+    "cb790d0c913647ced70fc6b38f5bea1a": {
+        "compensate": 1.010,
+        "mdx_dim_f_set": 2560,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 5120,
+        "primary_stem": "Instrumental"
+    },
+    "b02be2d198d4968a121030cf8950b492": {
+        "compensate": 1.020,
+        "mdx_dim_f_set": 2560,
+        "mdx_dim_t_set": 8,
+        "mdx_n_fft_scale_set": 5120,
+        "primary_stem": "No Crowd"
+    },
+    "2154254ee89b2945b97a7efed6e88820": {
+        "config_yaml": "model_2_stem_061321.yaml"
+    },
+    "063aadd735d58150722926dcbf5852a9": {
+        "config_yaml": "model_2_stem_061321.yaml"
+    },
+    "c09f714d978b41d718facfe3427e6001": {
+        "config_yaml": "model_2_stem_061321.yaml"
+    },
+    "fe96801369f6a148df2720f5ced88c19": {
+        "config_yaml": "model3.yaml"
+    },
+    "02e8b226f85fb566e5db894b9931c640": {
+        "config_yaml": "model2.yaml"
+    },
+    "e3de6d861635ab9c1d766149edd680d6": {
+        "config_yaml": "model1.yaml"
+    },
+    "3f2936c554ab73ce2e396d54636bd373": {
+        "config_yaml": "modelB.yaml"
+    },
+    "890d0f6f82d7574bca741a9e8bcb8168": {
+        "config_yaml": "modelB.yaml"
+    },
+    "63a3cb8c37c474681049be4ad1ba8815": {
+        "config_yaml": "modelB.yaml"
+    },
+    "a7fc5d719743c7fd6b61bd2b4d48b9f0": {
+        "config_yaml": "modelA.yaml"
+    },
+    "3567f3dee6e77bf366fcb1c7b8bc3745": {
+        "config_yaml": "modelA.yaml"
+    },
+    "a28f4d717bd0d34cd2ff7a3b0a3d065e": {
+        "config_yaml": "modelA.yaml"
+    },
+    "c9971a18da20911822593dc81caa8be9": {
+        "config_yaml": "sndfx.yaml"
+    },
+    "57d94d5ed705460d21c75a5ac829a605": {
+        "config_yaml": "sndfx.yaml"
+    },
+    "e7a25f8764f25a52c1b96c4946e66ba2": {
+        "config_yaml": "sndfx.yaml"
+    },
+    "104081d24e37217086ce5fde09147ee1": {
+        "config_yaml": "model_2_stem_061321.yaml"
+    },
+    "1e6165b601539f38d0a9330f3facffeb": {
+        "config_yaml": "model_2_stem_061321.yaml"
+    },
+    "fe0108464ce0d8271be5ab810891bd7c": {
+        "config_yaml": "model_2_stem_full_band.yaml"
+    },
+    "e9b82ec90ee56c507a3a982f1555714c": {
+        "config_yaml": "model_2_stem_full_band_2.yaml"
+    },
+    "99b6ceaae542265a3b6d657bf9fde79f": {
+        "config_yaml": "model_2_stem_full_band_8k.yaml"
+    },
+    "116f6f9dabb907b53d847ed9f7a9475f": {
+        "config_yaml": "model_2_stem_full_band_8k.yaml"
+    },
+    "53f707017bfcbb56f5e1bfac420d6732": {
+        "config_yaml": "model_bs_roformer_ep_317_sdr_12.9755.yaml",
+        "is_roformer": true
+    },
+    "63e41acc264bf681a73aa9f7e5f606cc": {
+        "config_yaml": "model_mel_band_roformer_ep_3005_sdr_11.4360.yaml",
+        "is_roformer": true
+    },
+    "e733736763234047587931fc35322fd9": {
+        "config_yaml": "model_bs_roformer_ep_937_sdr_10.5309.yaml",
+        "is_roformer": true
+    },
+    "d7a256bee3e7c620f554bceaab2f68f6": {
+        "config_yaml": "config_melbandroformer_inst.yaml",
+        "is_roformer": true
+    },
+    "365ccfa0e04b31ac2e24bbb935142a81": {
+        "config_yaml": "config_melbandroformer_inst.yaml",
+        "is_roformer": true
+    },
+    "3c15abf122d8eccc4a0eb97bf84a3e58": {
+        "config_yaml": "config_melbandroformer_instvoc_duality.yaml",
+        "is_roformer": true
+    },
+    "9fb197af219c5172ea38703a33aceb79": {
+        "config_yaml": "config_melbandroformer_instvoc_duality.yaml",
+        "is_roformer": true
+    },
+    "d789065adfd747d6f585b27b495bcdae": {
+        "config_yaml": "model_bs_roformer_ep_368_sdr_12.9628.yaml",
+        "is_roformer": true
+    },
+    "e4ca75912fcff3224a19058e55facfbf": {
+        "config_yaml": "config_vocals_mel_band_roformer_kim.yaml",
+        "is_roformer": true
+    },
+    "951f8ef420a941a395a9919f5d55cce9": {
+        "config_yaml": "config_melbandroformer_inst_v2.yaml",
+        "is_roformer": true
+    }
+}
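Note on mdx_model_data.json: the file holds two kinds of records. Some carry inline MDX parameters (`compensate`, `mdx_dim_f_set`, `mdx_dim_t_set`, `mdx_n_fft_scale_set`, `primary_stem`); the others only name a YAML config, and some of those are flagged `is_roformer`. The keys are 32-character hex strings, which is the length of an MD5 digest, but how a digest is derived from a model file is not shown in this diff. A rough Python sketch of telling the record kinds apart, with `classify_entry` and its return labels as hypothetical names chosen for illustration:

```python
import json

# Two records copied from mdx_model_data.json above; the real file has many more.
SAMPLE = json.loads("""
{
  "0ddfc0eb5792638ad5dc27850236c246": {
    "compensate": 1.035,
    "mdx_dim_f_set": 2048,
    "mdx_dim_t_set": 8,
    "mdx_n_fft_scale_set": 6144,
    "primary_stem": "Vocals"
  },
  "53f707017bfcbb56f5e1bfac420d6732": {
    "config_yaml": "model_bs_roformer_ep_317_sdr_12.9755.yaml",
    "is_roformer": true
  }
}
""")


def classify_entry(model_data, digest):
    """Label a record as 'inline', 'yaml', 'roformer-yaml', or 'unknown'.

    Hypothetical helper: the labels are illustrative, not UVR terminology.
    """
    entry = model_data.get(digest)
    if entry is None:
        return "unknown"
    if "config_yaml" in entry:
        return "roformer-yaml" if entry.get("is_roformer") else "yaml"
    return "inline"


for digest in SAMPLE:
    print(digest, classify_entry(SAMPLE, digest))
```

A caller would probably branch on the label: read the STFT settings directly for inline records, and load the named YAML file for the others.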
misc/vocal_separator/vr_model_data.json
ADDED
@@ -0,0 +1,137 @@
+{
+    "0d0e6d143046b0eecc41a22e60224582": {
+        "vr_model_param": "3band_44100_mid",
+        "primary_stem": "Instrumental"
+    },
+    "18b52f873021a0af556fb4ecd552bb8e": {
+        "vr_model_param": "2band_32000",
+        "primary_stem": "Instrumental"
+    },
+    "1fc66027c82b499c7d8f55f79e64cadc": {
+        "vr_model_param": "2band_32000",
+        "primary_stem": "Instrumental"
+    },
+    "2aa34fbc01f8e6d2bf509726481e7142": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "No Piano"
+    },
+    "3e18f639b11abea7361db1a4a91c2559": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Instrumental"
+    },
+    "570b5f50054609a17741369a35007ddd": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "Instrumental"
+    },
+    "5a6e24c1b530f2dab045a522ef89b751": {
+        "vr_model_param": "1band_sr44100_hl512",
+        "primary_stem": "Instrumental"
+    },
+    "6b5916069a49be3fe29d4397ecfd73fa": {
+        "vr_model_param": "3band_44100_msb2",
+        "primary_stem": "Instrumental",
+        "is_karaoke": true
+    },
+    "74b3bc5fa2b69f29baf7839b858bc679": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Instrumental"
+    },
+    "827213b316df36b52a1f3d04fec89369": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Instrumental"
+    },
+    "911d4048eee7223eca4ee0efb7d29256": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Vocals"
+    },
+    "941f3f7f0b0341f12087aacdfef644b1": {
+        "vr_model_param": "4band_v2",
+        "primary_stem": "Instrumental"
+    },
+    "a02827cf69d75781a35c0e8a327f3195": {
+        "vr_model_param": "1band_sr33075_hl384",
+        "primary_stem": "Instrumental"
+    },
+    "b165fbff113c959dba5303b74c6484bc": {
+        "vr_model_param": "3band_44100",
+        "primary_stem": "Instrumental"
+    },
+    "b5f988cd3e891dca7253bf5f0f3427c7": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Instrumental"
+    },
+    "b99c35723bc35cb11ed14a4780006a80": {
+        "vr_model_param": "1band_sr44100_hl1024",
+        "primary_stem": "Instrumental"
+    },
+    "ba02fd25b71d620eebbdb49e18e4c336": {
+        "vr_model_param": "3band_44100_mid",
+        "primary_stem": "Instrumental"
+    },
+    "c4476ef424d8cba65f38d8d04e8514e2": {
+        "vr_model_param": "3band_44100_msb2",
+        "primary_stem": "Instrumental"
+    },
+    "da2d37b8be2972e550a409bae08335aa": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Vocals"
+    },
+    "db57205d3133e39df8e050b435a78c80": {
+        "vr_model_param": "4band_44100",
+        "primary_stem": "Instrumental"
+    },
+    "ea83b08e32ec2303456fe50659035f69": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "Instrumental"
+    },
+    "f6ea8473ff86017b5ebd586ccacf156b": {
+        "vr_model_param": "4band_v2_sn",
+        "primary_stem": "Instrumental",
+        "is_karaoke": true
+    },
+    "fd297a61eafc9d829033f8b987c39a3d": {
+        "vr_model_param": "1band_sr32000_hl512",
+        "primary_stem": "Instrumental"
+    },
+    "0ec76fd9e65f81d8b4fbd13af4826ed8": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "No Woodwinds"
+    },
+    "0fb9249ffe4ffc38d7b16243f394c0ff": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "No Reverb"
+    },
+    "6857b2972e1754913aad0c9a1678c753": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "No Echo",
+        "nout": 48,
+        "nout_lstm": 128
+    },
+    "f200a145434efc7dcf0cd093f517ed52": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "No Echo",
+        "nout": 48,
+        "nout_lstm": 128
+    },
+    "44c55d8b5d2e3edea98c2b2bf93071c7": {
+        "vr_model_param": "4band_v3",
+        "primary_stem": "Noise",
+        "nout": 48,
+        "nout_lstm": 128
+    },
+    "51ea8c43a6928ed3c10ef5cb2707d57b": {
+        "vr_model_param": "1band_sr44100_hl1024",
+        "primary_stem": "Noise",
+        "nout": 16,
+        "nout_lstm": 128
+    },
+    "944950a9c5963a5eb70b445d67b7068a": {
+        "vr_model_param": "4band_v3_sn",
+        "primary_stem": "Vocals",
+        "nout": 64,
+        "nout_lstm": 128,
+        "is_karaoke": false,
+        "is_bv_model": true,
+        "is_bv_model_rebalanced": 0.9
+    }
+}
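Note on vr_model_data.json: every record above carries `vr_model_param` and `primary_stem`, while `nout`, `nout_lstm`, `is_karaoke`, `is_bv_model` and `is_bv_model_rebalanced` appear only on some. A small Python sketch of a sanity check built on that observation; treating the first two keys as required is an assumption drawn from this file alone, and `missing_keys` is a hypothetical helper.

```python
# Assumed-required keys, inferred only from the records in this diff.
REQUIRED = ("vr_model_param", "primary_stem")

# Two records copied from vr_model_data.json above.
SAMPLE = {
    "0d0e6d143046b0eecc41a22e60224582": {
        "vr_model_param": "3band_44100_mid",
        "primary_stem": "Instrumental",
    },
    "944950a9c5963a5eb70b445d67b7068a": {
        "vr_model_param": "4band_v3_sn",
        "primary_stem": "Vocals",
        "nout": 64,
        "nout_lstm": 128,
        "is_karaoke": False,
        "is_bv_model": True,
        "is_bv_model_rebalanced": 0.9,
    },
}


def missing_keys(model_data):
    """Map each record key to the assumed-required fields it lacks."""
    problems = {}
    for digest, record in model_data.items():
        missing = [key for key in REQUIRED if key not in record]
        if missing:
            problems[digest] = missing
    return problems


print(missing_keys(SAMPLE))
```

An empty result means every record has both fields; a check like this could run before the file is committed.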
reference_net/config.json
ADDED
@@ -0,0 +1,65 @@
+{
+  "_center_input_sample": false,
+  "_class_name": "UNet2DConditionModel",
+  "_diffusers_version": "0.31.0",
+  "_out_channels": 4,
+  "act_fn": "silu",
+  "addition_embed_type": null,
+  "addition_embed_type_num_heads": 64,
+  "addition_time_embed_dim": null,
+  "attention_head_dim": 8,
+  "attention_type": "default",
+  "block_out_channels": [
+    320,
+    640,
+    1280,
+    1280
+  ],
+  "center_input_sample": false,
+  "class_embed_type": null,
+  "class_embeddings_concat": false,
+  "conv_in_kernel": 3,
+  "cross_attention_dim": 768,
+  "down_block_types": [
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "CrossAttnDownBlock2D",
+    "DownBlock2D"
+  ],
+  "downsample_padding": 1,
+  "dropout": 0.0,
+  "dual_cross_attention": false,
+  "encoder_hid_dim": null,
+  "encoder_hid_dim_type": null,
+  "flip_sin_to_cos": true,
+  "freq_shift": 0,
+  "in_channels": 4,
+  "layers_per_block": 2,
+  "mid_block_only_cross_attention": null,
+  "mid_block_scale_factor": 1,
+  "mid_block_type": "UNetMidBlock2DCrossAttn",
+  "norm_eps": 1e-05,
+  "norm_num_groups": 32,
+  "num_attention_heads": null,
+  "num_class_embeds": null,
+  "only_cross_attention": false,
+  "out_channels": 4,
+  "projection_class_embeddings_input_dim": null,
+  "resnet_time_scale_shift": "default",
+  "reverse_transformer_layers_per_block": null,
+  "sample_size": 64,
+  "time_cond_proj_dim": null,
+  "time_embedding_act_fn": null,
+  "time_embedding_dim": null,
+  "time_embedding_type": "positional",
+  "timestep_post_act": null,
+  "transformer_layers_per_block": 1,
+  "up_block_types": [
+    "UpBlock2D",
+    "CrossAttnUpBlock2D",
+    "CrossAttnUpBlock2D",
+    "CrossAttnUpBlock2D"
+  ],
+  "upcast_attention": false,
+  "use_linear_projection": false
+}
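Note on reference_net/config.json: the three per-block lists (`block_out_channels`, `down_block_types`, `up_block_types`) each have four entries, and `sample_size` is 64. A short Python sketch that checks the lists agree in length and estimates the spatial size seen by each down block. It assumes the usual diffusers behaviour that every down block except the last halves the resolution; `feature_map_sizes` is a hypothetical helper and does not load the model.

```python
# Fields copied from reference_net/config.json above.
CONFIG = {
    "block_out_channels": [320, 640, 1280, 1280],
    "down_block_types": [
        "CrossAttnDownBlock2D",
        "CrossAttnDownBlock2D",
        "CrossAttnDownBlock2D",
        "DownBlock2D",
    ],
    "up_block_types": [
        "UpBlock2D",
        "CrossAttnUpBlock2D",
        "CrossAttnUpBlock2D",
        "CrossAttnUpBlock2D",
    ],
    "sample_size": 64,
}


def feature_map_sizes(config):
    """Return the assumed spatial size at which each down block operates."""
    n_blocks = len(config["block_out_channels"])
    if len(config["down_block_types"]) != n_blocks:
        raise ValueError("down_block_types length differs from block_out_channels")
    if len(config["up_block_types"]) != n_blocks:
        raise ValueError("up_block_types length differs from block_out_channels")
    sizes = []
    size = config["sample_size"]
    for index in range(n_blocks):
        sizes.append(size)
        if index < n_blocks - 1:  # assumption: the final block does not downsample
            size //= 2
    return sizes


for channels, size in zip(CONFIG["block_out_channels"], feature_map_sizes(CONFIG)):
    print(f"{channels} channels at {size}x{size}")
```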
reference_net/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:064447cf3e66fe1cc5812aa4b5e88716ebb46f27fdf6dff146f0e82469da5537
+size 3428346912
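Note on the pointer above: the weights themselves are stored in Git LFS, so the diff only shows a three-line pointer with the spec version, a SHA-256 object id, and the size in bytes. A minimal Python sketch of reading those fields, assuming the simple `key value` line layout seen here; `parse_lfs_pointer` is a hypothetical helper, not a Git LFS API.

```python
# Pointer text copied from reference_net/diffusion_pytorch_model.safetensors above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:064447cf3e66fe1cc5812aa4b5e88716ebb46f27fdf6dff146f0e82469da5537
size 3428346912
"""


def parse_lfs_pointer(text):
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
print(f"{info['hash_algo']} object, {info['size'] / 2**30:.2f} GiB")
```

After downloading the real file, its SHA-256 digest and byte length can be compared against `digest` and `size` to confirm the transfer.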