metadata

library_name: transformers
tags:
  - whisper
  - ASR
  - Akan
  - low-resource-language
  - speech-recognition
datasets:
  - Lagyamfi/akan_audio_processed
language:
  - ak
  - tw
metrics:
  - wer
base_model:
  - openai/whisper-small
pipeline_tag: automatic-speech-recognition

Model Card for Akan Whisper Model

Model Details

Model Description

This model is a fine-tuned version of OpenAI's Whisper small model (openai/whisper-small) for Automatic Speech Recognition (ASR) in Akan, a low-resource language spoken in Ghana. It was trained on the Lagyamfi/akan_audio_processed dataset of Akan audio clips and their transcriptions, and it transcribes spoken Akan into text.

Developed by: Mark Atta Mensah
Shared by: Mark Atta Mensah
Model type: Automatic Speech Recognition (ASR)
Language(s) (NLP): Akan (Twi)
Finetuned from model: openai/whisper-small

Uses

Direct Use

This model can be used directly to transcribe Akan speech into text. It is suitable for voice assistants, transcription services, and other applications that need Akan speech input.

Downstream Use

This model can be fine-tuned further or incorporated into larger applications that require multi-language ASR capabilities or specific domain adaptation for Akan.

Out-of-Scope Use

The model is intended only for Akan (Twi). It is not expected to transcribe other languages accurately, including other low-resource languages, without additional fine-tuning.

Bias, Risks, and Limitations

As with any ASR model, this model reflects the data it was trained on. It may underperform on accents, dialects, or language variations that are not well represented in the training data.

Recommendations

Users should be aware of these limitations and assess the model’s performance on specific applications before deployment.
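One way to carry out such an assessment is to compute the word error rate (WER, the metric listed in this card's metadata) on a small labelled sample from the target application. The following is a minimal sketch and not the evaluation code used for this model. `word_error_rate` is a hypothetical helper introduced here for illustration, and it applies no text normalization (casing, punctuation), which a real evaluation would likely need.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; words are split on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Averaging this score over reference transcripts and model outputs from the intended domain gives a rough indication of whether the model is accurate enough for that use.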

How to Get Started with the Model

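from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the fine-tuned model and its processor (feature extractor + tokenizer).
model = WhisperForConditionalGeneration.from_pretrained("GiftMark/akan-whisper-model")
processor = WhisperProcessor.from_pretrained("GiftMark/akan-whisper-model")

def transcribe(audio_array):
    """Transcribe a mono waveform (1-D float array sampled at 16 kHz) to text."""
    # Convert the raw waveform into the log-mel features Whisper expects.
    input_features = processor(audio_array, sampling_rate=16000, return_tensors="pt").input_features
    # Generate token IDs, then decode them to text without special tokens.
    predicted_ids = model.generate(input_features)
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
    return transcription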
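By default, the Whisper feature extractor pads or truncates its input to 30 seconds, so a longer recording passed to `transcribe` in one piece would lose everything after that window. The sketch below shows one simple workaround: split the waveform into consecutive 30-second slices and transcribe each. `split_into_chunks` is a hypothetical helper, not part of transformers, and this naive splitting can cut a word at a chunk boundary; overlapping windows or silence-based splitting would be more robust.

```python
def split_into_chunks(audio_array, sampling_rate=16000, chunk_seconds=30):
    """Yield consecutive slices of at most `chunk_seconds` from a 1-D waveform.

    Hypothetical helper for illustration; works on any sliceable sequence
    with a length (for example a NumPy array or a list of samples).
    """
    chunk_size = int(sampling_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("sampling_rate * chunk_seconds must be positive")
    for start in range(0, len(audio_array), chunk_size):
        yield audio_array[start:start + chunk_size]

# Usage with the transcribe() function above, assuming `long_audio` is a
# mono waveform sampled at 16 kHz:
# text = " ".join(transcribe(chunk) for chunk in split_into_chunks(long_audio))
```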