library_name: transformers
license: apache-2.0
base_model: openai/whisper-large-v3
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-large-v3-ft-cv-cy-en
results: []
datasets:
- techiaith/commonvoice_18_0_cy_en
language:
- cy
- en
pipeline_tag: automatic-speech-recognition
whisper-large-v3-ft-cv-cy-en
This model is a fine-tuned version of openai/whisper-large-v3 on the techiaith/commonvoice_18_0_cy_en dataset. Both the English and Welsh data were used to fine-tune the Whisper model, so that it transcribes both languages and detects the spoken language more reliably.
It achieves a success rate of 98.86% for language detection on recordings from a Common Voice bilingual test set.
On the same test set, it achieves the following word error rate (WER) results for transcription:
- Welsh: 26.20
- English: 15.37
- Average: 20.70
N.B. The desired transcript language was not given to the fine-tuned model during testing, so the model had to detect the language of each recording itself.
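For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. It is an illustration only. The function name `word_error_rate` is hypothetical, the result is expressed as a percentage on the assumption that the figures above are percentages, and this card does not state which tool or text normalisation was used to produce the reported scores.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: 100 * (S + D + I) / N.

    Illustrative sketch only; no text normalisation is applied,
    which is an assumption and may differ from the evaluation above.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One deleted word ("yn") out of nine reference words.
reference = "mae hen wlad fy nhadau yn annwyl i mi"
hypothesis = "mae hen wlad fy nhadau annwyl i mi"
print(round(word_error_rate(reference, hypothesis), 2))  # 11.11
```

Lower is better: a score of 0 means the transcript matches the reference word for word, and scores can exceed 100 when the hypothesis contains many inserted words.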
Usage
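from transformers import pipeline

# Load the fine-tuned model into a speech-recognition pipeline.
transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-cv-cy-en")
# Replace the placeholder with a local path or URL to an audio file.
result = transcriber("<path or url to soundfile>")
print(result)
# Example output:
# {'text': 'Mae hen wlad fy nhadau yn annwyl i mi.'}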