
whisper-large-v3-ft-btb-cv-cy

This model is a version of openai/whisper-large-v3 fine-tuned on transcriptions of spontaneous Welsh speech from Banc Trawsgrifiadau Bangor (btb), with recordings of read speech from Welsh Common Voice version 18 (cv) used as additional training data.

As such, this model is suited to more verbatim transcription of spontaneous or unplanned speech. It achieves the following results on the Banc Trawsgrifiadau Bangor test set:

  • WER: 29.72
  • CER: 11.01
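For readers unfamiliar with these metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. It assumes the scores above are percentages and applies no text normalisation; the model card does not state either point, so this is not a reproduction of the evaluation. The function names and example sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative strings only, not taken from the test set.
reference = "mae hi yn braf heddiw"
hypothesis = "mae hi braf heddi"
print(wer(reference, hypothesis))  # one deleted word and one substituted word out of five
print(cer(reference, hypothesis))
```

Evaluation scripts often lower-case the text and strip punctuation before scoring, which can change the numbers noticeably.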

Usage

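from transformers import pipeline

# Load the fine-tuned model into a speech recognition pipeline
transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-btb-cv-cy")
# Replace the placeholder with a path or URL to a sound file
result = transcriber("<path or url to soundfile>")
print(result)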

{'text': "ymm, yn y pum mlynadd dwitha 'ma ti 'di... Ie. ...bod drw dipyn felly do?"}
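Whisper models work on 30-second windows, so longer recordings are usually transcribed in chunks. The sketch below shows one way to do that with the same `pipeline` API, using its `chunk_length_s` option and `return_timestamps=True`. The helper names (`build_pipeline_kwargs`, `transcribe_long`, `format_chunks`) are hypothetical and not part of the model card, and the 30-second chunk length is an assumed default, not a recommendation from the model authors.

```python
MODEL_ID = "techiaith/whisper-large-v3-ft-btb-cv-cy"


def build_pipeline_kwargs(chunk_length_s=30, device=None):
    """Collect keyword arguments for transformers.pipeline (hypothetical helper)."""
    kwargs = {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "chunk_length_s": chunk_length_s,  # split long audio into windows of this length
    }
    if device is not None:
        kwargs["device"] = device  # e.g. 0 for the first GPU
    return kwargs


def transcribe_long(audio_path, **overrides):
    """Transcribe a long recording and return the text with timestamped chunks."""
    # Imported lazily so the helpers above can be used without transformers installed.
    from transformers import pipeline

    transcriber = pipeline(**build_pipeline_kwargs(**overrides))
    return transcriber(audio_path, return_timestamps=True)


def format_chunks(result):
    """Turn the 'chunks' list of a pipeline result into printable lines."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]  # end may be None for the final chunk
        lines.append(f"[{start} - {end}] {chunk['text'].strip()}")
    return lines
```

Calling `transcribe_long("<path or url to soundfile>")` downloads the model on first use; passing the result to `format_chunks` gives one line per timestamped segment.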

Model size: 1.54B params
Tensor type: F32