👳 Arabic-Whisper-CodeSwitching-Edition

This model is a fine-tuned version of OpenAI's Whisper Large v2, trained on an Arabic-English code-switching dataset.

πŸ“ Model Details

Model Description

Arabic-Whisper-CodeSwitching-Edition is designed to transcribe Arabic audio that contains embedded English words. It builds on the original Whisper Large v2 and improves its performance on Arabic-English code-switched speech.

  • Developed by: العبد لله (Arabic, literally "the servant of God", a humble self-reference)
  • Model type: Speech Recognition
  • Language(s) (NLP): Arabic, English (in the context of Arabic audio)
  • License: GPL-3.0

👷 Uses

Direct Use

The model can be used directly for transcribing Arabic speech that includes English words. It is particularly useful in multilingual environments where code-switching is common.
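To see where a transcription switches between Arabic and English, it can help to tag each word by script. The following is a minimal sketch that uses only the standard library. The helper names `tag_scripts` and `count_switches` are hypothetical, and the sample sentence is a hand-written illustration, not actual model output.

```python
import re

# Basic Arabic and Arabic Supplement Unicode blocks
ARABIC_CHARS = re.compile(r"[\u0600-\u06FF\u0750-\u077F]")
LATIN_CHARS = re.compile(r"[A-Za-z]")


def tag_scripts(text):
    """Tag each whitespace-separated token as 'ar', 'en', or 'other'."""
    tagged = []
    for token in text.split():
        # Arabic is checked first, so a mixed-script token counts as 'ar'
        if ARABIC_CHARS.search(token):
            tagged.append((token, "ar"))
        elif LATIN_CHARS.search(token):
            tagged.append((token, "en"))
        else:
            tagged.append((token, "other"))
    return tagged


def count_switches(tagged):
    """Count transitions between 'ar' and 'en', ignoring 'other' tokens."""
    langs = [lang for _, lang in tagged if lang != "other"]
    return sum(1 for prev, cur in zip(langs, langs[1:]) if prev != cur)


# Hand-written example sentence (an assumption, not model output)
sample = "أنا عندي meeting بكرة"
tagged = tag_scripts(sample)
print(tagged)
print(count_switches(tagged))
```

Tagging by script is only a rough proxy for language: English words transliterated into Arabic script would be counted as Arabic.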

Out-of-Scope Use

The model may not perform well on monolingual speech in languages other than Arabic or English, or on speech with code-switching in languages other than Arabic and English.

😨 Bias, Risks, and Limitations

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.

πŸ” How to Get Started with the Model

Use the code below to get started with the model.

import librosa  # used here only to load and resample the audio file
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_id = "MohamedRashad/Arabic-Whisper-CodeSwitching-Edition"
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# Example usage
# The processor expects a raw waveform (not a file path) sampled at 16 kHz
audio, sampling_rate = librosa.load("path_to_audio_file.wav", sr=16000)
inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")

# Generate token IDs, then decode them into text
generated_ids = model.generate(inputs["input_features"])
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
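
Whisper's feature extractor works on 16 kHz audio, so it can be worth checking a recording before transcribing it. The sketch below reads the header of a PCM WAV file with Python's standard `wave` module and reports whether resampling or down-mixing to mono is likely needed. The helper name `inspect_wav` and the returned keys are hypothetical, and the function only handles uncompressed WAV files.

```python
import wave

WHISPER_SR = 16_000  # sampling rate used by Whisper's feature extractor


def inspect_wav(path):
    """Return basic properties of a PCM WAV file and what preprocessing it needs."""
    with wave.open(path, "rb") as wav:
        sampling_rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sampling_rate": sampling_rate,
        "channels": channels,
        "duration_s": frames / sampling_rate,
        "needs_resampling": sampling_rate != WHISPER_SR,
        "needs_downmix": channels != 1,
    }


# Usage (path is a placeholder):
# info = inspect_wav("path_to_audio_file.wav")
# print(info)
```

If `needs_resampling` is true, loading the file with a resampling loader (for example `librosa.load(path, sr=16000)`) avoids passing audio at the wrong rate to the processor.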

πŸ‘¨β€πŸŽ“ Citation

BibTeX:

@misc{rashad2024arabicwhisper,
  title={Arabic-Whisper-CodeSwitching-Edition},
  author={Mohamed Rashad},
  year={2024},
  url={https://huggingface.co/spaces/MohamedRashad/Arabic-Whisper-CodeSwitching-Edition},
}

APA:

Rashad, M. (2024). Arabic-Whisper-CodeSwitching-Edition. Retrieved from https://huggingface.co/spaces/MohamedRashad/Arabic-Whisper-CodeSwitching-Edition

Model size: 1.54B params (Safetensors, tensor type BF16)