---
library_name: peft
license: apache-2.0
datasets:
  - mozilla-foundation/common_voice_11_0
language:
  - yue
metrics:
  - cer
pipeline_tag: automatic-speech-recognition
---

🤗 HF Repo • 🐱 GitHub Repo

Approximate Performance Evaluation

The following models were all trained and evaluated on a single RTX 3090 GPU.

Cantonese Test Results Comparison

MDCC

| Model name | Parameters | Finetune Steps | Time Spent | Training Loss | Validation Loss | CER % | Finetuned Model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| whisper-tiny-cantonese | 39 M | 3200 | 4h 34m | 0.0485 | 0.771 | 11.10 | Link |
| whisper-base-cantonese | 74 M | 7200 | 13h 32m | 0.0186 | 0.477 | 7.66 | Link |
| whisper-small-cantonese | 244 M | 3600 | 6h 38m | 0.0266 | 0.137 | 6.16 | Link |
| whisper-small-lora-cantonese | 3.5 M | 8000 | 21h 27m | 0.0687 | 0.382 | 7.40 | Link |
| whisper-large-v2-lora-cantonese | 15 M | 10000 | 33h 40m | 0.0046 | 0.277 | 3.77 | Link |
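
The parameter counts for the two LoRA rows are much smaller than the base models because only the adapter weights are trained. As a rough sanity check, the sketch below counts LoRA parameters under assumptions that this README does not state: rank 32, adapters on the query and value projections of every attention block, and the standard Whisper layer sizes. The helper `lora_param_count` is hypothetical and written only for this illustration. Under those assumptions the totals land close to the 3.5 M and 15 M figures in the table.

```python
def lora_param_count(d_model, encoder_layers, decoder_layers, rank,
                     targets_per_attention=2):
    """Count LoRA adapter parameters (hypothetical helper, assumed config)."""
    # Each encoder layer has one attention block (self-attention);
    # each decoder layer has two (self-attention and cross-attention).
    attention_blocks = encoder_layers + 2 * decoder_layers
    # Assumption: adapters on two projections (query, value) per block.
    adapted_matrices = attention_blocks * targets_per_attention
    # One adapter on a d_model x d_model projection adds
    # A (rank x d_model) and B (d_model x rank).
    params_per_matrix = 2 * rank * d_model
    return adapted_matrices * params_per_matrix


# Assumed sizes: Whisper small (width 768, 12 + 12 layers),
# Whisper large-v2 (width 1280, 32 + 32 layers); rank 32 is an assumption.
small = lora_param_count(d_model=768, encoder_layers=12, decoder_layers=12, rank=32)
large_v2 = lora_param_count(d_model=1280, encoder_layers=32, decoder_layers=32, rank=32)

print(f"whisper-small LoRA:    {small:,}")     # 3,538,944
print(f"whisper-large-v2 LoRA: {large_v2:,}")  # 15,728,640
```

If the actual adapters used a different rank or different target modules, the counts would scale accordingly.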

Common Voice Corpus 11.0

| Model name | Original CER % | w/o Finetune CER % | Jointly Finetune CER % |
| --- | --- | --- | --- |
| whisper-tiny-cantonese | 124.03 | 66.85 | 35.87 |
| whisper-base-cantonese | 78.24 | 61.42 | 16.73 |
| whisper-small-cantonese | 52.83 | 31.23 | / |
| whisper-small-lora-cantonese | 37.53 | 19.38 | 14.73 |
| whisper-large-v2-lora-cantonese | 37.53 | 19.38 | 9.63 |
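
Some CER values above exceed 100 %. That is possible because CER divides the character-level edit distance (substitutions, deletions, and insertions) by the reference length, so a hypothesis with many inserted characters can accumulate more errors than the reference has characters. The following is a minimal sketch of that definition in plain Python; it is not necessarily the evaluation code behind these tables, which may use a library metric and its own text normalization. The function names `edit_distance` and `cer` are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted per character."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * total_edits / total_chars


# One substituted character out of five.
print(cer(["今日天氣好"], ["今日天气好"]))  # 20.0
# Four inserted characters against a two-character reference.
print(cer(["你好"], ["你好你好你好"]))      # 200.0
```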