AI & ML interests
Audio and Multimodal Learning
Organization Card
ALM: Audio Language and Multimodal
ALM is a collaborative research group focused on deep learning for audio, language, and multimodal data.
About Us
- Alkis Koudounas - PhD Student at Politecnico di Torino (Profile | polito.it)
- Lorenzo Vaiani - PhD Student at Politecnico di Torino (Profile | polito.it)
- Moreno La Quatra - Research Fellow at Kore University of Enna (Profile | unikore.it)
Projects
- ARCH - Audio Representation Benchmark (Repo): A platform dedicated to benchmarking models for audio representations. Research Paper
- CALM - Contrastive Alignment of Language and Music: A project from the 1st Sound of AI Hackathon. CALM aligns songs with natural language descriptions, enabling music searches via text, voice, or facial expressions.
- PACE - Podcast AI for Chapters and Episodes: PACE is a semantic search engine for podcasts. It enables users to search for specific parts of a podcast using natural language. The project was created for the AssemblyAI 50K Hackathon - Winter 2022.
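To illustrate the kind of natural-language retrieval PACE offers, the sketch below ranks transcript segments against a free-text query using sentence embeddings. This is a minimal sketch, not PACE's actual implementation: the sentence-transformers library, the all-MiniLM-L6-v2 checkpoint, and the toy transcript segments are all assumptions.

```python
# Minimal sketch of natural-language search over podcast transcript segments.
# Not PACE's implementation: the sentence-transformers library, the
# all-MiniLM-L6-v2 checkpoint, and the toy segments are assumptions.
from sentence_transformers import SentenceTransformer, util

segments = [
    "In this chapter we cover the history of automatic speech recognition.",
    "Our guest explains how contrastive learning aligns audio and text.",
    "We wrap up with book recommendations and listener questions.",
]

# Embed the transcript segments and the query in the same vector space.
model = SentenceTransformer("all-MiniLM-L6-v2")
segment_embeddings = model.encode(segments, convert_to_tensor=True)

query = "where do they talk about contrastive learning?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank segments by cosine similarity to the query and return the best match.
scores = util.cos_sim(query_embedding, segment_embeddings)[0]
best = int(scores.argmax())
print(f"Best match (score {float(scores[best]):.2f}): {segments[best]}")
```

In a full pipeline the segments would come from an ASR transcript (for example, produced by one of the Whisper models listed below) rather than hand-written strings.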
Collections: 2 • Spaces: 2 • Models: 14

Models (fine-tuned Whisper checkpoints for different languages and AudioSet models for audio classification)
- ALM/whisper-el-small-augmented • Automatic Speech Recognition • 57 downloads
- ALM/whisper-cy-small-augmented • Automatic Speech Recognition • 92 downloads
- ALM/whisper-it-small-augmented • Automatic Speech Recognition • 84 downloads • 1 like
- ALM/whisper-sk-small-augmented • Automatic Speech Recognition • 118 downloads
- ALM/whisper-da-small-augmented • Automatic Speech Recognition • 98 downloads
- ALM/whisper-it-medium-augmented • Automatic Speech Recognition • 40 downloads • 1 like
- ALM/wav2vec2-large-audioset • Audio Classification • 106 downloads
- ALM/hubert-base-audioset • Audio Classification • 120 downloads • 2 likes
- ALM/hubert-large-audioset • Audio Classification • 99 downloads
- ALM/wav2vec2-base-audioset • Audio Classification • 128 downloads
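The checkpoints above can be loaded through the Hugging Face transformers pipeline API. The snippet below is a minimal usage sketch: the transformers dependency and the local audio file name are assumptions, and any of the whisper-*-augmented checkpoints can be swapped in.

```python
# Minimal usage sketch for the ASR checkpoints listed above, assuming the
# transformers library is installed; "italian_sample.wav" is a placeholder
# for a local audio file.
from transformers import pipeline

# Any of the whisper-*-augmented checkpoints can be substituted here.
asr = pipeline("automatic-speech-recognition", model="ALM/whisper-it-small-augmented")

result = asr("italian_sample.wav")
print(result["text"])

# The AudioSet checkpoints (e.g. ALM/hubert-base-audioset) are listed under the
# audio-classification task; loading them with
# pipeline("audio-classification", model="ALM/hubert-base-audioset")
# assumes the checkpoint ships a classification head.
```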
Datasets
None public yet