MSA-ASR
Multilingual Speaker-Attributed Automatic Speech Recognition
Introduction
This repository provides an implementation of a Speaker-Attributed Automatic Speech Recognition (SA-ASR) model. The model performs multilingual speech recognition and extracts speaker embeddings from the same audio, enabling the transcript to be attributed to individual speakers.
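To make the output concrete, below is a minimal sketch of how speaker-attributed transcripts can be assembled by clustering per-word speaker embeddings. The words, embeddings, and clustering threshold here are synthetic placeholders, not the repository's actual pipeline or API (see infer.py for the real entry point).

```python
# Illustrative sketch only: cluster per-word speaker embeddings into speaker labels.
# The (word, embedding) pairs below are synthetic stand-ins for model output.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
spk_a, spk_b = rng.normal(size=64), rng.normal(size=64)
words = [
    ("hello", spk_a + 0.05 * rng.normal(size=64)),
    ("there", spk_a + 0.05 * rng.normal(size=64)),
    ("hi",    spk_b + 0.05 * rng.normal(size=64)),
    ("again", spk_b + 0.05 * rng.normal(size=64)),
]

# L2-normalise embeddings so cosine distance is well behaved, then cluster.
emb = np.stack([e for _, e in words])
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
labels = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0.7, metric="cosine", linkage="average"
).fit_predict(emb)

# Attach a speaker label to each recognised word.
for (word, _), spk in zip(words, labels):
    print(f"SPK{spk}: {word}")
```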
Model architecture
Setup
git clone git@github.com:nguyenvulebinh/MSA-ASR.git
cd MSA-ASR
conda create -n MSA-ASR python=3.10
conda activate MSA-ASR
pip install -r requirements.txt
Test script:
python infer.py
Citation
@misc{nguyen2025msaasrefficientmultilingualspeaker,
title={MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models},
author={Thai-Binh Nguyen and Alexander Waibel},
year={2025},
eprint={2411.18152},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2411.18152},
}
License
CC-BY-NC 4.0
Contact
Contributions are welcome; feel free to create a PR or email me:
[Binh Nguyen](nguyenvulebinh[at]gmail.com)