---
datasets:
- mozilla-foundation/common_voice_16_1
language:
- ta
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---

This model is fine-tuned on the Tamil subset of Common Voice 16.1. The transcripts were preprocessed with Epitran to transliterate the text into IPA. The 'tam-Taml' language code was used to generate the phoneme list below, which captures the distinctions of Tamil phonetics:

* Vowels:
  * Monophthongs: 'a', 'aː', 'e', 'eː', 'i', 'iː', 'o', 'oː', 'u', 'uː'
  * Diphthongs: 'aj', 'aʋ'
* Consonants:
  * Nasals: 'm', 'n̪', 'n', 'ɳ', 'ɲ', 'ŋ'
  * Stops: 'p', 't̪', 'ʈ', 'k'
  * Affricates: 't͡ʃ', 'd͡ʒ'
  * Fricatives: 's', 'ʂ', 'ʃ', 'h'
  * Tap: 'ɾ'
  * Trill: 'r'
  * Approximants: 'ʋ', 'ɻ', 'j', 'l', 'ɭ'
  * Consonant cluster: 'kʂ'
* Special Symbols: '்' (denotes the absence of the inherent vowel)
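
The phoneme list above can be used to split an IPA string into individual symbols. The following is a minimal sketch, not the preprocessing code used for this model: `PHONEMES`, `INVENTORY`, and `segment` are hypothetical names, and greedy longest-match segmentation is an assumption about how multi-character symbols such as 't͡ʃ' or 'aː' should be kept together.

```python
# Phoneme inventory copied from the list above, grouped by category.
PHONEMES = {
    "monophthongs": ["a", "aː", "e", "eː", "i", "iː", "o", "oː", "u", "uː"],
    "diphthongs": ["aj", "aʋ"],
    "nasals": ["m", "n̪", "n", "ɳ", "ɲ", "ŋ"],
    "stops": ["p", "t̪", "ʈ", "k"],
    "affricates": ["t͡ʃ", "d͡ʒ"],
    "fricatives": ["s", "ʂ", "ʃ", "h"],
    "tap": ["ɾ"],
    "trill": ["r"],
    "approximants": ["ʋ", "ɻ", "j", "l", "ɭ"],
    "clusters": ["kʂ"],
    "special": ["்"],
}

# Flatten the groups and sort longest-first so that multi-character
# symbols are tried before their single-character prefixes.
INVENTORY = sorted(
    {sym for group in PHONEMES.values() for sym in group},
    key=lambda s: (-len(s), s),
)


def segment(ipa: str) -> list[str]:
    """Greedy longest-match split of an IPA string into inventory symbols.

    Characters that are not part of any inventory symbol (spaces,
    punctuation) are skipped. This is an illustrative assumption, not
    necessarily how the model's vocabulary was built.
    """
    symbols = []
    i = 0
    while i < len(ipa):
        for sym in INVENTORY:
            if ipa.startswith(sym, i):
                symbols.append(sym)
                i += len(sym)
                break
        else:
            # No inventory symbol starts here: skip one character.
            i += 1
    return symbols
```

With this sketch, `segment("aj")` yields the single diphthong `'aj'` rather than `'a'` followed by `'j'`, because longer symbols take priority. Whether the model's own tokenizer resolves such cases the same way is not stated here, so treat the tie-breaking rule as a design choice of the example.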