---
language:
- sv-SE
license: cc0-1.0
tags:
- automatic-speech-recognition
- mozilla-foundation/common_voice_8_0
- generated_from_trainer
- sv
- robust-speech-event
- model_for_talk
datasets:
- mozilla-foundation/common_voice_8_0
- marinone94/nst_sv
model-index:
- name: XLS-R-300M - Swedish
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: mozilla-foundation/common_voice_8_0
      type: mozilla-foundation/common_voice_8_0
      args: sv-SE
    metrics:
    - name: Test WER
      type: wer
      value: 16.98
    - name: Test CER
      type: cer
      value: 5.66
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: speech-recognition-community-v2/dev_data
      type: speech-recognition-community-v2/dev_data
      args: sv
    metrics:
    - name: Test WER
      type: wer
      value: 27.01
    - name: Test CER
      type: cer
      value: 13.14
---

This model is a fine-tuned version of [KBLab/wav2vec2-large-voxrex](https://huggingface.co/KBLab/wav2vec2-large-voxrex). It was first trained for 2 epochs on the marinone94/nst_sv (sv) dataset, using an 80% random split with seed 42 because the dataset currently has only a "train" split, and then for 50 epochs on the mozilla-foundation/common_voice_8_0 (sv-SE) dataset ("train+validation" split).

See run.sh for a complete overview of all the training steps.

NOTE: the first training stage has not worked as expected so far, so it might be useless or even degrade performance. Further investigation and development are needed.
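
The 80% random split with seed 42 mentioned above can be illustrated with a minimal, standard-library sketch. This is not the code from run.sh: the function name `split_indices`, its parameters, and the example size of 1000 are hypothetical and chosen only to show how a fixed seed makes a random split reproducible.

```python
import random


def split_indices(num_examples, train_fraction=0.8, seed=42):
    """Shuffle example indices with a fixed seed and split them in two.

    Hypothetical helper for illustration; the actual training script
    may split the data differently.
    """
    indices = list(range(num_examples))
    # A dedicated Random instance keeps the shuffle reproducible and
    # independent of the global random state.
    random.Random(seed).shuffle(indices)
    cut = int(num_examples * train_fraction)
    return indices[:cut], indices[cut:]


# Example: 1000 utterances, 80% kept for training, 20% held out.
train_idx, held_out_idx = split_indices(1000)
print(len(train_idx), len(held_out_idx))
```

Calling the function twice with the same seed returns the same two index lists, which is what makes such a split repeatable across runs. With the Hugging Face `datasets` library, a comparable seeded split is available through `Dataset.train_test_split(test_size=0.2, seed=42)`; whether run.sh uses that call is not stated here.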