---
license: apache-2.0
language:
- mk
library_name: speechbrain
pipeline_tag: automatic-speech-recognition
base_model:
- jonatasgrosman/wav2vec2-large-xlsr-53-russian
model-index:
- name: wav2vec2-aed-macedonian-asr
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Macedonian Common Voice V.18.0
      type: macedonian-common-voice-v.18.0
    metrics:
    - name: Test WER
      type: test-wer
      value: 5.66
    - name: Test CER
      type: test-cer
      value: 1.43
---

# Fine-tuned XLSR-53-russian large model for speech recognition in Macedonian

Authors:
1. Dejan Porjazovski
2. Ilina Jakimovska
3. Ordan Chukaliev
4. Nikola Stikov

This collaboration is part of the activities of the Center for Advanced Interdisciplinary Research (CAIR) at UKIM.

## Model description

This model is an attention-based encoder-decoder (AED). The encoder is a Wav2vec2 model and the decoder is RNN-based.

## Usage

The model was developed with the [SpeechBrain](https://speechbrain.github.io) toolkit. To use it, install SpeechBrain:

```
pip install speechbrain
```

SpeechBrain relies on the Transformers library, so you also need to install it:

```
pip install transformers
```

This Hugging Face repository uses an external Python file, `custom_interface.py` (passed through the `pymodule_file` argument), as a custom predictor class. The `foreign_class` function from `speechbrain.inference.interfaces` loads this custom model:

```python
import torch
from speechbrain.inference.interfaces import foreign_class

# Use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the custom ASR interface defined in this repository.
asr_classifier = foreign_class(
    source="Macedonian-ASR/buki-wav2vec2-2.0",
    pymodule_file="custom_interface_app.py",
    classname="ASR",
)
asr_classifier = asr_classifier.to(device)

# Transcribe a single audio file.
predictions = asr_classifier.classify_file("audio_file.wav", device)
print(predictions)
```

## Training

To fine-tune this model, run:

```
python train.py hyperparams.yaml
```

The `train.py` file contains the functions needed to train the model, and `hyperparams.yaml` contains the hyperparameters. For more details about training, refer to the [SpeechBrain](https://speechbrain.github.io) documentation.
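
As a follow-up to the Usage example, the snippet below is a minimal sketch of how several recordings could be transcribed in one pass. The helper name `transcribe_files` is hypothetical and not part of this repository or of SpeechBrain. It assumes only the `classify_file(path, device)` call shown in the Usage section and stores whatever that call returns, without assuming a particular output format.

```python
from pathlib import Path
from typing import Any, Dict, Iterable


def transcribe_files(classifier: Any, paths: Iterable[str], device: Any) -> Dict[str, Any]:
    """Run `classifier.classify_file` on each existing path and collect the outputs.

    Hypothetical helper: it relies only on the `classify_file(path, device)`
    call from the Usage section and keeps the return value unchanged.
    """
    results: Dict[str, Any] = {}
    for path in paths:
        if not Path(path).is_file():
            # Skip missing files instead of failing the whole batch.
            print(f"Skipping missing file: {path}")
            continue
        results[str(path)] = classifier.classify_file(str(path), device)
    return results
```

With the objects from the Usage section, a call such as `transcribe_files(asr_classifier, ["first.wav", "second.wav"], device)` would return a dictionary that maps each file path to its prediction. The file names here are placeholders.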