Cadenza Challenge: CAD2-Task2

A causal separation model for the CAD2-Task2 system.

This model is an ensemble of separation models, one for each of the following instruments:

Bassoon
Cello
Clarinet
Flute
Oboe
Sax
Viola
Violin

Each model is based on the ConvTasNet implementation by Kaituo Xu, with multichannel support by Alexandre Défossez.

Parameters:
- B: 256
- C: 2
- H: 512
- L: 20
- N: 256
- P: 3
- R: 3
- X: 8
- audio_channels: 2
- causal: true
- mask_nonlinear: relu
- norm_type: cLN
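
The parameter names follow the Conv-TasNet paper's notation. The sketch below collects them in a small container and derives the encoder hop and the temporal receptive field of the separator. It is only an illustration: `ConvTasNetConfig` is a hypothetical name, and the arithmetic assumes the usual Conv-TasNet layout (encoder stride of L/2, and dilations 1, 2, ..., 2^(X-1) inside each of the R repeats), which this model card does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvTasNetConfig:
    """Hypothetical container for the hyperparameters listed above."""

    N: int = 256  # number of encoder filters
    L: int = 20   # encoder filter length, in samples
    B: int = 256  # bottleneck channels
    H: int = 512  # channels inside each convolutional block
    P: int = 3    # kernel size of the convolutional blocks
    X: int = 8    # convolutional blocks per repeat
    R: int = 3    # number of repeats
    C: int = 2    # number of output sources per model
    audio_channels: int = 2

    @property
    def encoder_stride(self) -> int:
        # Assumption: 50% overlap between encoder frames (stride = L / 2).
        return self.L // 2

    @property
    def receptive_field_frames(self) -> int:
        # Assumption: block x uses dilation 2**x, so each block widens the
        # receptive field by (P - 1) * 2**x encoder frames.
        return 1 + self.R * (self.P - 1) * sum(2**x for x in range(self.X))

    @property
    def receptive_field_samples(self) -> int:
        # Convert encoder frames back to input samples.
        return (self.receptive_field_frames - 1) * self.encoder_stride + self.L


cfg = ConvTasNetConfig()
print(cfg.encoder_stride)           # 10
print(cfg.receptive_field_frames)   # 1531
print(cfg.receptive_field_samples)  # 15320
```

With `causal: true`, this context would lie entirely in the past of the current frame, under the same assumptions.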

Dataset

The model was trained using the EnsembleSet and CadenzaWoodwind datasets.

How to use

from dynamic_source_separator import DynamicSourceSeparator

model = DynamicSourceSeparator.from_pretrained(
    "cadenzachallenge/Dynamic_Source_Separator_Causal"
).cpu()

Description

Audio source separation model used in System T002 for the Cadenza2 Task2 Challenge.

The model is a fine-tuned version of the eight ConvTasNet models from the Task2 baseline. Training jointly optimised the estimated sources and the reconstructed mixture:

$Loss = \sum_{s \in Sources} L_1(estimated~source_s, ref~source_s) + L_1(reconstructed~mixture, original~mixture)$

import torch.nn as nn


def dynamic_masked_loss(mixture, separated_sources, ground_truth_sources, indicator):
    """L1 reconstruction loss plus L1 separation loss over the active instruments."""
    l1 = nn.L1Loss()
    # Reconstruction loss: the estimated sources should sum to the mixture
    reconstruction = sum(separated_sources.values())
    reconstruction_loss = l1(reconstruction, mixture)
    # Separation loss: only instruments flagged as active in `indicator` contribute
    separation_loss = 0
    for instrument, active in indicator.items():
        if active:
            separation_loss += l1(
                separated_sources[instrument], ground_truth_sources[instrument]
            )
    return reconstruction_loss + separation_loss
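
To make the masking behaviour concrete, here is a small pure-Python sketch of the same computation on toy data. It is not the training code: `l1` and `toy_dynamic_masked_loss` are hypothetical helpers that stand in for `nn.L1Loss()` (mean absolute error) and the function above, and the signals are made-up four-sample lists.

```python
def l1(a, b):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def toy_dynamic_masked_loss(mixture, separated, references, indicator):
    # Reconstruction term: add the estimates sample by sample and
    # compare the result with the mixture.
    reconstruction = [sum(samples) for samples in zip(*separated.values())]
    loss = l1(reconstruction, mixture)
    # Separation term: instruments marked inactive are skipped.
    for instrument, active in indicator.items():
        if active:
            loss += l1(separated[instrument], references[instrument])
    return loss


mixture = [1.0, 2.0, 3.0, 4.0]
references = {"flute": [1.0, 2.0, 3.0, 4.0], "oboe": [0.0, 0.0, 0.0, 0.0]}
separated = {"flute": [1.0, 2.0, 2.0, 4.0], "oboe": [0.0, 0.0, 1.0, 0.0]}

# Only the flute is present in this toy mixture.
flute_only = toy_dynamic_masked_loss(
    mixture, separated, references, {"flute": True, "oboe": False}
)
# For comparison: the same data with both instruments marked active.
both_active = toy_dynamic_masked_loss(
    mixture, separated, references, {"flute": True, "oboe": True}
)
print(flute_only)   # 0.25
print(both_active)  # 0.5
```

The estimates add up to the mixture exactly, so the reconstruction term is zero here. The oboe leakage is penalised by the separation term only when the oboe is marked active; with the oboe marked inactive, only the reconstruction term constrains its output.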

The model and the T002 recipe are shared in the Clarity toolkit.