---
language:
- da
license: other
datasets:
- ftspeech
metrics:
- wer
tasks:
- automatic-speech-recognition
base_model: facebook/wav2vec2-xls-r-300m
model-index:
- name: wav2vec2-xls-r-300m-ftspeech
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      name: Danish Common Voice 8.0
      type: mozilla-foundation/common_voice_8_0
      args: da
    metrics:
    - type: wer
      value: 17.91
  - task:
      type: automatic-speech-recognition
    dataset:
      name: Alvenir ASR test dataset
      type: Alvenir/alvenir_asr_da_eval
    metrics:
    - type: wer
      value: 13.84
---
# XLS-R-300m-FTSpeech

## Model description

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the [FTSpeech dataset](https://ftspeech.github.io/), which consists of 1,800 hours of transcribed speeches from the Danish Parliament.
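
As a rough illustration of how a fine-tuned wav2vec 2.0 checkpoint like this one is typically used, the sketch below wraps the `transformers` automatic-speech-recognition pipeline in a small helper. The helper name `transcribe` and its parameters are hypothetical, and the Hub repository ID is deliberately left as a required argument because this card does not state the namespace the model is published under. The sketch assumes `transformers` and `torch` are installed and that `ffmpeg` is available for decoding audio files.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a fine-tuned wav2vec 2.0 checkpoint.

    `model_id` is the Hub repository ID of the checkpoint. It is an
    assumption left to the caller, since the card does not spell it out.
    """
    # Imported lazily so that merely defining the helper does not require
    # the (heavy) transformers dependency to be installed.
    from transformers import pipeline

    # Build an automatic-speech-recognition pipeline for the checkpoint.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # When given a file path, the pipeline decodes the audio via ffmpeg and
    # returns a dict whose "text" entry holds the transcription.
    result = asr(audio_path)
    return result["text"]
```

A call would look like `transcribe("speech.wav", model_id="<namespace>/wav2vec2-xls-r-300m-ftspeech")`, where `speech.wav` is a placeholder file name and `<namespace>` must be filled in by the reader.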
## Performance

The model achieves the following WER scores (lower is better):

| **Dataset** | **WER without LM** | **WER with 5-gram LM** |
| :---: | ---: | ---: |
| [Danish part of Common Voice 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0/viewer/da/train) | 20.48 | 17.91 |
| [Alvenir test set](https://huggingface.co/datasets/Alvenir/alvenir_asr_da_eval) | 15.46 | 13.84 |
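
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate can be computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It returns a percentage, on the assumption that the table values are percentages. It is not the evaluation script behind the reported numbers; the text normalisation used there is not described in this card, and the function names are hypothetical. The second helper only restates the table's columns as a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent (whitespace tokenisation)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(wer_before: float, wer_after: float) -> float:
    """Return the relative WER reduction in percent."""
    return 100.0 * (wer_before - wer_after) / wer_before


# One deleted word out of five reference words.
print(word_error_rate("det er en god dag", "det er en dag"))

# Relative effect of the 5-gram LM, using the values from the table above.
print(round(relative_reduction(20.48, 17.91), 1))  # Common Voice 8.0
print(round(relative_reduction(15.46, 13.84), 1))  # Alvenir test set
```
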
## License

Use of this model must comply with [this license from the Danish Parliament](https://www.ft.dk/da/aktuelt/tv-fra-folketinget/deling-og-rettigheder).