library_name: transformers
tags:
- whisper
- ASR
- Akan
- low-resource-language
- speech-recognition
datasets:
- Lagyamfi/akan_audio_processed
language:
- ak
- tw
metrics:
- wer
base_model:
- openai/whisper-small
pipeline_tag: automatic-speech-recognition
Model Card for Akan Whisper Model
Model Details
Model Description
This model is a fine-tuned version of OpenAI's Whisper small model (openai/whisper-small) for Automatic Speech Recognition (ASR) in Akan, a low-resource language spoken in Ghana. It was trained on Akan audio clips paired with their transcriptions (the card metadata lists the Lagyamfi/akan_audio_processed dataset) and transcribes spoken Akan into text.
- Developed by: Mark Atta Mensah
- Shared by: Mark Atta Mensah
- Model type: Automatic Speech Recognition (ASR)
- Language(s) (NLP): Akan (Twi)
- Finetuned from model: openai/whisper-small
Uses
Direct Use
This model can be used directly to transcribe Akan speech into text. It is suitable for applications such as voice assistants, transcription services, and other speech-driven products that require Akan language support.
Downstream Use
This model can be fine-tuned further or incorporated into larger applications that require multi-language ASR capabilities or specific domain adaptation for Akan.
Out-of-Scope Use
The model is intended only for Akan (Twi). It is not suitable for other languages and may not perform well on other low-resource languages without additional fine-tuning.
Bias, Risks, and Limitations
As with any ASR model, this model reflects the data it was trained on. It may underperform on accents, dialects, or language variations that are not well represented in the training data.
Recommendations
Users should be aware of these limitations and assess the model’s performance on specific applications before deployment.
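The card metadata lists WER as the metric, so one way to carry out this assessment is to compute word error rate on a small held-out sample from the target application. The following is a minimal pure-Python sketch under that assumption, not the evaluation script used for this model; the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

Scores depend on how text is normalized before comparison (casing, punctuation, spelling variants), so apply the same normalization to references and model output when comparing runs.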
How to Get Started with the Model
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the fine-tuned checkpoint and its matching processor from the Hub.
model = WhisperForConditionalGeneration.from_pretrained("GiftMark/akan-whisper-model")
processor = WhisperProcessor.from_pretrained("GiftMark/akan-whisper-model")

def transcribe(audio_array):
    # audio_array: 1-D mono waveform sampled at 16 kHz.
    inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt").input_features
    predicted_ids = model.generate(inputs)
    # Decode the generated token IDs to text, dropping special tokens.
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
    return transcription
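Whisper's feature extractor pads or truncates each input to a 30-second window, so `transcribe` as written only covers the first 30 seconds of a longer recording. The sketch below shows one simple way to prepare longer audio for it. The helper names `to_mono_float32` and `split_into_windows` are hypothetical, and the code assumes the waveform is already sampled at 16 kHz, holds float samples in [-1, 1], and uses a (samples, channels) layout when it has more than one channel.

```python
import numpy as np

SAMPLE_RATE = 16_000   # rate passed to the processor in transcribe()
WINDOW_SECONDS = 30    # Whisper's fixed input window

def to_mono_float32(audio):
    """Average the channels of a (samples, channels) array and cast to float32.

    Assumes samples are already floats in [-1, 1]; no rescaling is done.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 2:
        audio = audio.mean(axis=1)
    return audio

def split_into_windows(audio, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Split a 1-D waveform into consecutive chunks of at most window_seconds."""
    step = sample_rate * window_seconds
    return [audio[start:start + step] for start in range(0, len(audio), step)]
```

With these helpers, a long recording could be transcribed as `" ".join(transcribe(chunk) for chunk in split_into_windows(to_mono_float32(waveform)))`. Hard cuts can split a word at a chunk boundary; the `transformers` automatic-speech-recognition pipeline offers chunked inference through its `chunk_length_s` argument, which may be a better fit for production use.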