Fine-tuning Whisper models for shorter audio segments

by Malishevsky - opened

Hi all. My project needs to recognize many short audio segments. Can I fine-tune the multilingual model for short audio clips of around 10 seconds? If not, can I train a model from scratch for this purpose? I would be grateful for any help or hints.

Hi, this is a general question about Whisper, and it seems you already asked it in openai/whisper, which is a better place for this type of question:

So I'm closing the discussion here.

guillaumekln changed discussion status to closed
