alibabasglab/AV_MossFormer2_TSE_16K

alibabasglab committed on Nov 25, 2024

Commit 59fb133 • 1 Parent(s): 2ef614a

Update README.md

Files changed (1)

README.md +9 -3

@@ -1,3 +1,9 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+These are the AV_MossFormer2_TSE_16K model weights for 16 kHz audio-visual target speaker extraction in the [ClearerVoice-Studio](https://github.com/modelscope/ClearerVoice-Studio/tree/main) repo.
+This model is trained on large-scale open-source datasets.
+It extracts each speaker's voice from a multi-speaker video using facial recognition.