MSA-ASR

Multilingual Speaker-Attributed Automatic Speech Recognition

Introduction

This repository provides an implementation of a Speaker-Attributed Automatic Speech Recognition (SA-ASR) model. The model performs multilingual speech recognition and extracts speaker embeddings at the same time, so the transcribed speech can be attributed to different speakers.
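To illustrate how speaker embeddings can be used to tell speakers apart, here is a minimal sketch that groups per-segment embeddings by cosine similarity. It is not the method used in this repository: the function name `assign_speakers`, the greedy grouping strategy, and the `threshold` value are all assumptions chosen for illustration, and the toy 2-D vectors stand in for real embeddings.

```python
import math


def _unit(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def _cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(_unit(a), _unit(b)))


def assign_speakers(embeddings, threshold=0.7):
    """Greedily group segment embeddings into speakers.

    Hypothetical helper: each embedding joins the most similar existing
    speaker if the cosine similarity reaches `threshold` (an assumed
    cutoff), otherwise it starts a new speaker.
    Returns one integer speaker label per input embedding.
    """
    centroids = []  # running sum of unit vectors for each speaker
    labels = []
    for emb in embeddings:
        vec = _unit(emb)
        sims = [_cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            # Fold the segment into the matched speaker's running sum.
            centroids[best] = [c + v for c, v in zip(centroids[best], vec)]
            labels.append(best)
        else:
            centroids.append(vec)
            labels.append(len(centroids) - 1)
    return labels


# Toy example: two segments point one way, two point another.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
print(assign_speakers(segments))  # [0, 0, 1, 1]
```

A greedy pass like this depends on segment order and on the chosen threshold; a proper clustering method would usually be more robust on real recordings.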

Model architecture

[Figure: MSA-ASR model architecture]

Setup

git clone git@github.com:nguyenvulebinh/MSA-ASR.git
cd MSA-ASR
conda create -n MSA-ASR python=3.10
conda activate MSA-ASR
pip install -r requirements.txt

Test script:

python infer.py
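If `python infer.py` fails with an import error, it can help to check which packages from `requirements.txt` are missing from the active environment. The sketch below is one way to do that using only the standard library. The helper name `missing_requirements` is hypothetical, and the parsing is an assumption that covers only plain `name` or `name==version` lines; pip options and URL-based requirements are skipped.

```python
import re
from importlib import metadata
from pathlib import Path


def missing_requirements(requirement_lines):
    """Return the package names from the given lines that are not installed.

    Hypothetical helper: handles only simple "name" / "name==version" style
    lines; comments, blank lines, pip options, and URL requirements are skipped.
    """
    missing = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue
        name = match.group(0)
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # run from the repository root
    if req_file.exists():
        print(missing_requirements(req_file.read_text().splitlines()))
```

An empty list means every parsed requirement is installed; it does not check that the installed versions match the pinned ones.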

Citation

@misc{nguyen2025msaasrefficientmultilingualspeaker,
      title={MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models}, 
      author={Thai-Binh Nguyen and Alexander Waibel},
      year={2025},
      eprint={2411.18152},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.18152}, 
}

License

CC-BY-NC 4.0

Contact

Contributions are welcome; feel free to create a PR or email me:

[Binh Nguyen](nguyenvulebinh[at]gmail.com)