arxiv:2406.18120

ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs

Published on Jun 26

Submitted by ahmedheakl on Jun 28


Authors: Ahmed Heakl, Youssef Zaghloul, Rania Hossam, Walid Gomaa

Abstract

Motivated by the recent widespread increase in code-switching between Egyptian Arabic and English, this paper explores machine translation (MT) and automatic speech recognition (ASR) systems, focusing on translating code-switched Egyptian Arabic-English into either English or Egyptian Arabic. We present the methodologies used to develop these systems, which build on large language models such as Llama and Gemma. For ASR, we explore the Whisper model for code-switched Egyptian Arabic recognition and detail our experimental procedures, including data preprocessing and training techniques. By implementing a consecutive speech-to-text translation system that integrates ASR with MT, we aim to overcome the challenges posed by limited resources and the unique characteristics of the Egyptian Arabic dialect. Evaluation against established metrics shows promising results: our methodologies yield an improvement over the state of the art of 56% in English translation and 9.3% in Arabic translation. Since code-switching is deeply inherent in spoken languages, it is crucial that ASR systems handle it effectively. This capability enables seamless interaction in domains such as business negotiations, cultural exchanges, and academic discourse. Our models and code are available as open-source resources. Code: http://github.com/ahmedheakl/arazn-llm, Models: http://huggingface.co/collections/ahmedheakl/arazn-llm-662ceaf12777656607b9524e.
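The consecutive speech-to-text translation system described in the abstract chains an ASR stage into an MT stage. The following is a minimal sketch of that cascade only, not the authors' implementation: the class name `CascadedSpeechTranslator`, the language codes `"en"` and `"arz"`, and the stub functions `fake_asr` and `fake_mt` are all hypothetical placeholders. In a real system the two callables would wrap a fine-tuned Whisper model and a fine-tuned Llama or Gemma model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CascadedSpeechTranslator:
    """Hypothetical cascade: audio -> transcript (ASR) -> translation (MT)."""

    transcribe: Callable[[bytes], str]      # ASR stage: audio bytes -> text
    translate: Callable[[str, str], str]    # MT stage: (text, target language) -> text

    def __call__(self, audio: bytes, target_lang: str) -> Dict[str, str]:
        # Assumed language codes: "en" for English, "arz" for Egyptian Arabic.
        if target_lang not in ("en", "arz"):
            raise ValueError(f"unsupported target language: {target_lang}")
        transcript = self.transcribe(audio)
        translation = self.translate(transcript, target_lang)
        return {"transcript": transcript, "translation": translation}


# Stand-in stages so the sketch runs without any model weights.
def fake_asr(audio: bytes) -> str:
    return "placeholder code-switched transcript"


def fake_mt(text: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


pipeline = CascadedSpeechTranslator(transcribe=fake_asr, translate=fake_mt)
result = pipeline(b"\x00\x01", "en")
```

Keeping the two stages as separate callables lets each be evaluated on its own, at the cost that ASR errors propagate into the translation.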


Community

ahmedheakl

Paper author, Paper submitter · Jun 28 (edited Jun 28)

This paper addresses the growing phenomenon of code-switching between Egyptian Arabic and English by developing machine translation (MT) and automatic speech recognition (ASR) systems to translate code-switched language into either English or Egyptian Arabic. Utilizing large language models like Llama and Gemma for MT, and the Whisper model for ASR, we detail our experimental procedures, including data preprocessing and training techniques. Our integrated speech-to-text translation system, designed to tackle limited resources and the unique traits of Egyptian Arabic, shows significant improvements: 56% in English translation and 9.3% in Arabic translation over state-of-the-art benchmarks. This advancement is critical for effective ASR in various domains. Our models and code are available as open-source resources.

AdinaY

Jun 28

https://huggingface.co/collections/ahmedheakl/arzen-llm-662ceaf12777656607b9524e

nielsr

Jul 1

Hi @ahmedheakl congrats on this work!!

Would it be possible to link the models and collection to this paper page? See here for more info: https://huggingface.co/docs/hub/en/model-cards#linking-a-paper

ahmedheakl

Paper author Jul 1

Done, thank you!


Models citing this paper 13


Datasets citing this paper 2

Spaces citing this paper 0
