Multi-Language Support?

#3
by Sesideh - opened

Since the OpenAI model you are using is multilingual, do you think it is possible to extend it into a multilingual speech emotion detection model as well?

I was inspired by two papers, namely "Breaking the Silence: Whisper-Driven Emotion Recognition in AI Mental Support Models" and "EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Benchmark", to develop a Whisper Large V3-based Speech Emotion Recognition (SER) project. The model excels at multilingual transcription and is robust to accents and noise, which makes it a promising candidate for SER.

In the first paper, adding an extra Transformer layer on top of the Whisper encoder improved emotion detection accuracy to 95%. Meanwhile, the second paper shows that Whisper Large V3 performs strongly in cross-corpus settings, demonstrating its capability for multilingual SER.
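To make the first paper's idea concrete, here is a minimal sketch of that kind of head: one extra Transformer layer over the Whisper encoder's hidden states, followed by mean pooling and a linear classifier. The hidden size 1280 matches Whisper Large V3, but the number of attention heads, the feed-forward size, and the 8 emotion classes are my own assumptions for illustration, not values taken from the paper.

```python
import torch
import torch.nn as nn

class EmotionHead(nn.Module):
    """Hypothetical SER head on top of the Whisper encoder.

    Expects encoder hidden states of shape (batch, frames, d_model),
    e.g. the output of WhisperModel's encoder. All hyperparameters
    here are illustrative assumptions.
    """
    def __init__(self, d_model: int = 1280, num_emotions: int = 8):
        super().__init__()
        # One additional Transformer layer, as in the first paper's setup.
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=8,
            dim_feedforward=4 * d_model,
            batch_first=True,
        )
        self.classifier = nn.Linear(d_model, num_emotions)

    def forward(self, encoder_hidden: torch.Tensor) -> torch.Tensor:
        x = self.block(encoder_hidden)   # refine the encoder features
        x = x.mean(dim=1)                # pool over the time axis
        return self.classifier(x)        # (batch, num_emotions) logits

# A random tensor stands in for real Whisper encoder output here.
head = EmotionHead()
logits = head(torch.randn(2, 1500, 1280))
print(logits.shape)  # torch.Size([2, 8])
```

In practice you would feed this head the encoder output of a frozen or fine-tuned Whisper Large V3 and train only the classification layers on an emotion-labeled corpus such as those collected in EmoBox.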

Based on these two references, I am working on adapting the Whisper Large V3 architecture for multilingual Speech Emotion Recognition. I am still learning, so if there are any inaccuracies or areas for improvement, I would greatly appreciate suggestions and feedback to help me refine my approach.
