---
library_name: transformers
tags:
- whisper
- ASR
- Akan
- low-resource-language
- speech-recognition
datasets:
- Lagyamfi/akan_audio_processed
language:
- ak
- tw
metrics:
- wer
base_model:
- openai/whisper-small
pipeline_tag: automatic-speech-recognition
---

# Model Card for Akan Whisper Model

## Model Details

### Model Description

This model is a fine-tuned version of OpenAI's Whisper model, designed for Automatic Speech Recognition (ASR) on the Akan language, a low-resource language spoken in Ghana. The model was trained on a dataset containing Akan audio clips and corresponding transcriptions, enabling it to transcribe spoken Akan into text.

- **Developed by:** Mark Atta Mensah
- **Shared by:** Mark Atta Mensah
- **Model type:** Automatic Speech Recognition (ASR)
- **Language(s) (NLP):** Akan (Twi)
- **Finetuned from model:** openai/whisper-small

## Uses

### Direct Use

This model can be directly used for transcribing Akan speech into text. It is suitable for applications like voice assistants, transcription services, and other language-based solutions that require Akan language support.

### Downstream Use

This model can be fine-tuned further or incorporated into larger applications that require multi-language ASR capabilities or specific domain adaptation for Akan.

### Out-of-Scope Use

The model is not suitable for languages other than Akan (Twi) and may not perform well on other low-resource languages without additional fine-tuning.

## Bias, Risks, and Limitations

As with any ASR model, this model may have biases based on the dataset it was trained on. Potential biases in the training data could lead to underperformance on accents, dialects, or language variations not well represented in the data.

### Recommendations

Users should be aware of these limitations and assess the model’s performance on specific applications before deployment.
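One way to assess performance on your own data is word error rate (WER), the metric named in this card's metadata. The card does not say how WER was computed, so the following is only a minimal sketch in plain Python, assuming whitespace tokenization and no text normalization. The function name `word_error_rate` and the sample strings are hypothetical placeholders, not part of this model's evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Akan text or model output.
reference = "w1 w2 w3 w4"
hypothesis = "w1 wX w3"
print(word_error_rate(reference, hypothesis))  # 0.5 (one substitution + one deletion over 4 words)
```

In practice, the hypothesis would be the model's transcription of a clip and the reference its human transcription. Evaluation pipelines often normalize case and punctuation before scoring; this sketch does not.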
## How to Get Started with the Model

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the fine-tuned model and its matching processor from the Hugging Face Hub.
model = WhisperForConditionalGeneration.from_pretrained("GiftMark/akan-whisper-model")
processor = WhisperProcessor.from_pretrained("GiftMark/akan-whisper-model")


def transcribe(audio_array):
    # audio_array: 1-D float waveform sampled at 16 kHz (the rate passed below).
    inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt").input_features
    predicted_ids = model.generate(inputs)
    # Decode the generated token IDs to text, dropping special tokens.
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
    return transcription
```