ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs
Abstract
Motivated by the growing prevalence of code-switching between Egyptian Arabic and English, this paper explores machine translation (MT) and automatic speech recognition (ASR) systems for translating code-switched Egyptian Arabic-English into either English or Egyptian Arabic. We present the methodologies used to develop these systems with large language models such as Llama and Gemma. For ASR, we explore the Whisper model for code-switched Egyptian Arabic recognition and detail our experimental procedures, including data preprocessing and training techniques. By implementing a consecutive speech-to-text translation system that integrates ASR with MT, we aim to overcome the challenges posed by limited resources and the unique characteristics of the Egyptian Arabic dialect. Evaluation against established metrics shows promising results: our methodologies yield improvements over the state of the art of 56% in English translation and 9.3% in Arabic translation. Because code-switching is deeply inherent in spoken language, ASR systems must handle it effectively to enable seamless interaction in domains such as business negotiations, cultural exchanges, and academic discourse. Our models and code are available as open-source resources. Code: http://github.com/ahmedheakl/arazn-llm, Models: http://huggingface.co/collections/ahmedheakl/arazn-llm-662ceaf12777656607b9524e.
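The consecutive speech-to-text translation setup described in the abstract (ASR output fed into MT) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `speech_to_text_translation`, `CascadeResult`, `fake_asr`, and `fake_mt` are hypothetical, the stand-in components only return placeholder strings, and the target codes `"en"` and `"arz"` are an assumed convention. In a real system the two callables would wrap a Whisper-based recognizer and an LLM-based translator.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces; the paper's actual interfaces are not shown on this page.
Transcriber = Callable[[bytes], str]    # raw audio -> code-switched transcript
Translator = Callable[[str, str], str]  # (transcript, target code) -> translation

# Assumed target codes: "en" for English, "arz" for Egyptian Arabic.
SUPPORTED_TARGETS = ("en", "arz")


@dataclass
class CascadeResult:
    transcript: str   # intermediate ASR output
    translation: str  # final MT output


def speech_to_text_translation(
    audio: bytes, asr: Transcriber, mt: Translator, target: str
) -> CascadeResult:
    """Run ASR first, then translate the resulting transcript."""
    if target not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {target!r}")
    transcript = asr(audio)
    translation = mt(transcript, target)
    return CascadeResult(transcript=transcript, translation=translation)


# Stand-in components for illustration only; they perform no real recognition
# or translation.
def fake_asr(audio: bytes) -> str:
    return "<code-switched transcript>"


def fake_mt(transcript: str, target: str) -> str:
    return f"[{target}] {transcript}"


if __name__ == "__main__":
    result = speech_to_text_translation(b"\x00\x01", fake_asr, fake_mt, "en")
    print(result.transcript)
    print(result.translation)
```

Keeping the transcript in the result makes it possible to evaluate the ASR and MT stages separately, which matters in a cascade because recognition errors propagate into the translation.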
Community
This paper addresses the growing phenomenon of code-switching between Egyptian Arabic and English by developing machine translation (MT) and automatic speech recognition (ASR) systems that translate code-switched language into either English or Egyptian Arabic. Using large language models such as Llama and Gemma for MT, and the Whisper model for ASR, we detail our experimental procedures, including data preprocessing and training techniques. Our integrated speech-to-text translation system, designed to cope with limited resources and the unique traits of Egyptian Arabic, improves on the state of the art by 56% in English translation and 9.3% in Arabic translation. Handling code-switching well is critical for effective ASR in many domains. Our models and code are available as open-source resources.
Hi @ahmedheakl congrats on this work!!
Would it be possible to link the models and collection to this paper page? See here for more info: https://huggingface.co/docs/hub/en/model-cards#linking-a-paper
Done, thank you!