Achieves a word error rate (WER) of 14.4 on the test set.
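For readers unfamiliar with the metric, here is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; the text normalization and tooling used to obtain the figure above are not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length.

    Illustrative only: tokenizes on whitespace, with no case folding or
    punctuation stripping (real evaluations usually normalize text first).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

This scores a single utterance. A corpus-level WER is usually obtained by summing errors and reference words over all utterances before dividing, rather than averaging per-utterance scores. If the 14.4 above is a percentage, as is common, it would correspond to 0.144 on this function's scale.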
Training totaled 10 hours, with 40K audio samples ranging from 5 to 30 seconds each.
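As a rough sanity check on these figures, the total amount of audio can be bounded from the sample count and clip lengths. The sketch below assumes "40K" means exactly 40,000 clips; the constant and function names are hypothetical.

```python
NUM_SAMPLES = 40_000   # assumption: "40K" taken as exactly 40,000 clips
MIN_SECONDS = 5        # shortest clip length stated above
MAX_SECONDS = 30       # longest clip length stated above


def total_hours(num_samples: int, seconds_per_sample: float) -> float:
    """Total audio duration in hours if every clip had the given length."""
    return num_samples * seconds_per_sample / 3600


lower = total_hours(NUM_SAMPLES, MIN_SECONDS)
upper = total_hours(NUM_SAMPLES, MAX_SECONDS)
print(f"{lower:.1f} to {upper:.1f} hours of audio")
# prints "55.6 to 333.3 hours of audio"
```

Since even the lower bound is well above 10 hours, the 10-hour figure most plausibly refers to training time rather than to the amount of audio. That reading is an inference from the numbers, not something stated explicitly.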
Hardware: one A100 and two T4 GPUs.