marinone94/xls-r-300m-sv-robust

---
language:
- sv-SE
license: cc0-1.0
tags:
- automatic-speech-recognition
- mozilla-foundation/common_voice_8_0
- generated_from_trainer
- sv
- robust-speech-event
- model_for_talk
datasets:
- mozilla-foundation/common_voice_8_0
- marinone94/nst_sv
model-index:
- name: XLS-R-300M - Swedish
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: mozilla-foundation/common_voice_8_0
      type: mozilla-foundation/common_voice_8_0
      args: sv-SE
    metrics:
    - name: Test WER
      type: wer
      value: 16.98
    - name: Test CER
      type: cer
      value: 5.66
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: speech-recognition-community-v2/dev_data
      type: speech-recognition-community-v2/dev_data
      args: sv
    metrics:
    - name: Test WER
      type: wer
      value: 27.01
    - name: Test CER
      type: cer
      value: 13.14
---

This model is a fine-tuned version of [KBLab/wav2vec2-large-voxrex](https://huggingface.co/KBLab/wav2vec2-large-voxrex). It was first trained for 2 epochs on the MARINONE94/NST_SV - SV dataset (a random 80% split with seed 42, since the dataset currently has only a "train" split), and then for 50 epochs on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - SV-SE dataset ("train+validation" split).
See run.sh for a complete overview of all the training steps.
NOTE: the first training stage has not worked as expected so far, so it may be useless or even degrade performance. Further investigation and development are needed.
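
The seeded 80% random split mentioned above can be illustrated with a minimal, standard-library sketch. This is not taken from run.sh: the helper name `split_indices` and its parameters are hypothetical, and the actual training script may build the split differently (for example with the `train_test_split(test_size=0.2, seed=42)` method of the Hugging Face `datasets` library, which would not select the same examples as this sketch).

```python
import random


def split_indices(num_examples, train_fraction=0.8, seed=42):
    """Hypothetical helper: reproducibly shuffle example indices and
    split them into (train, held_out) lists."""
    indices = list(range(num_examples))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    cut = int(num_examples * train_fraction)
    return indices[:cut], indices[cut:]


# With 10 examples, 8 go to training and 2 are held out.
train_idx, held_out_idx = split_indices(10)
print(len(train_idx), len(held_out_idx))  # prints "8 2"
```

Fixing the seed means that rerunning the split yields the same partition, so the held-out 20% stays unseen across training runs.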