Respeecher/ukrainian-data2vec-asr

This model is a fine-tuned version of Respeecher/ukrainian-data2vec, trained on the Ukrainian train split of the Common Voice 11.0 dataset. It achieves the following word error rate (WER) results:

  • eval_wer: 17.634350000973198
  • test_wer: 17.042283338786351
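
For readers unfamiliar with the metric, the sketch below shows one common way to compute a corpus-level WER: the word-level edit distance (substitutions, deletions, insertions) summed over all utterances, divided by the total number of reference words, and reported as a percentage. This is an illustration only. The model card does not state which tool or text normalization produced the figures above, and the function names `edit_distance` and `wer` are hypothetical helpers, not part of this repository.

```python
def edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists."""
    # prev[j] holds the distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER in percent (assumes whitespace tokenization)."""
    total_errors = 0
    total_words = 0
    for reference, hypothesis in zip(references, hypotheses):
        ref_words = reference.split()
        hyp_words = hypothesis.split()
        total_errors += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return 100.0 * total_errors / total_words


# One deleted word out of five reference words in total.
print(wer(["добрий день усім", "як справи"], ["добрий день", "як справи"]))  # 20.0
```

Under this definition, a WER of about 17 would mean roughly 17 word-level errors per 100 reference words, assuming the reported values are percentages.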

How to Get Started with the Model

from transformers import AutoProcessor, Data2VecAudioForCTC
import torch
from datasets import load_dataset, Audio

dataset = load_dataset("mozilla-foundation/common_voice_11_0", "uk", split="test")
# Resample the audio to 16 kHz before passing it to the processor
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))

processor = AutoProcessor.from_pretrained("Respeecher/ukrainian-data2vec-asr")
model = Data2VecAudioForCTC.from_pretrained("Respeecher/ukrainian-data2vec-asr")
model.eval()

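# Convert one test utterance into model inputs
sampling_rate = dataset.features["audio"].sampling_rate
inputs = processor(dataset[1]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: pick the most likely token for each frame
predicted_ids = torch.argmax(logits, dim=-1)

# Convert the token ids to text
transcription = processor.batch_decode(predicted_ids)
print(transcription[0])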
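
The last two steps of the snippet above (per-frame `argmax`, then `batch_decode`) amount to greedy CTC decoding: consecutive repeated ids are merged, blank tokens are dropped, and the remaining tokens are joined into text. The sketch below illustrates that idea in plain Python. It is a simplification, not the tokenizer's actual implementation: the toy vocabulary, the blank id `0`, and the `|` word delimiter are assumptions chosen for the example, and `ctc_greedy_decode` is a hypothetical helper.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse a per-frame id sequence into text (greedy CTC decoding)."""
    tokens = []
    previous = None
    for token_id in frame_ids:
        # Keep an id only when it differs from the previous frame
        # and is not the blank token.
        if token_id != previous and token_id != blank_id:
            tokens.append(id_to_token[token_id])
        previous = token_id
    # Turn word-delimiter tokens into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary (hypothetical; the real one comes with the processor).
toy_vocab = {0: "<pad>", 1: "|", 2: "т", 3: "а", 4: "к"}

# Per-frame argmax ids, as they might come out of the logits.
frames = [0, 2, 2, 0, 3, 3, 3, 0, 4, 1, 1, 2, 0, 3, 4]
print(ctc_greedy_decode(frames, toy_vocab))  # так так
```

A blank between two identical ids keeps them as separate characters, which is how CTC represents doubled letters: `[2, 0, 2]` decodes to "тт", while `[2, 2]` decodes to "т".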

Training Details

Training code and instructions are available on our GitHub.
