---
language:
- sv
license: apache-2.0
tags:
- hf-asr-leaderboard
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_11_0
metrics:
- wer
model-index:
- name: Whisper Small Sv
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 11.0
      type: mozilla-foundation/common_voice_11_0
      config: sv
      split: test[:10%]
      args: 'config: sv, split: test'
    metrics:
    - name: Wer
      type: wer
      value: 19.76284584980237
---

# Whisper Small Swedish

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Swedish subset of the Common Voice 11.0 dataset.
It achieves the following results on the evaluation set:
- Wer: 19.8166

## Model description & uses

This model is the OpenAI Whisper small transformer, fine-tuned for Swedish speech-to-text transcription.
It can be tried through its [Hugging Face web app](https://huggingface.co/spaces/torileatherman/whisper_small_sv).

## Training and evaluation data

The training data is the first 10% of the train and validation splits of [Swedish Common Voice](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0/viewer/sv/train) 11.0 from the Mozilla Foundation.
The evaluation data is the first 10% of the test split of the same dataset.

The training data was augmented with random noise, random pitch shifts, and changes to the speaking speed.

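
The card does not say how the augmentation was implemented, so the following is only a minimal sketch of two of the named transforms (additive noise and speed change) on a raw waveform held as a list of floats. The function names, the noise level, and the 0.9 to 1.1 speed range are all illustrative assumptions, not the settings used for this model.

```python
import random


def add_noise(samples, noise_level, rng):
    """Add zero-mean Gaussian noise to every sample."""
    return [s + rng.gauss(0.0, noise_level) for s in samples]


def change_speed(samples, rate):
    """Resample by linear interpolation; rate > 1 shortens (speeds up) the clip."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(len(samples) / rate))
    out = []
    for i in range(n_out):
        pos = i * rate                 # fractional position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def augment(samples, seed=None):
    """Apply a random speed change, then additive noise (assumed parameters)."""
    rng = random.Random(seed)
    rate = rng.uniform(0.9, 1.1)       # assumed range, not from the model card
    return add_noise(change_speed(samples, rate), 0.005, rng)
```

Plain resampling like this shifts pitch together with tempo when the result is played back at the original sample rate. Shifting pitch independently of speed needs a dedicated signal-processing routine, which is not shown here.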
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 500
- training_steps: 4000
- weight decay: 0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 0.1379        | 0.95  | 1000 | 0.295811        | 21.467 |
| 0.0245        | 2.86  | 3000 | 0.300059        | 20.160 |
| 0.0060        | 3.82  | 4000 | 0.320301        | 19.762 |

### Framework versions

- Transformers 4.26.0.dev0
- Pytorch 1.12.1+cu113
- Datasets 2.7.1
- Tokenizers 0.13.2
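
The Wer values reported in this card are word error rates in percent. As a rough illustration of the metric, the sketch below computes corpus-level WER as the total number of word-level edits (substitutions, insertions, deletions) divided by the total number of reference words. It assumes the standard definition and plain whitespace tokenization; the card does not state which tool or text normalization produced its numbers, and the Swedish sentences in the usage example are made up.

```python
def edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent over paired reference/hypothesis strings."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return 100.0 * total_edits / total_words


# Hypothetical example: 2 word errors over 7 reference words.
refs = ["jag heter anna", "det regnar i dag"]
hyps = ["jag heter anna", "det regnar idag"]
wer = word_error_rate(refs, hyps)
```

Because the edits are summed over the whole corpus before dividing, long utterances weigh more than short ones, which differs from averaging per-utterance error rates.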