plz, help
Hello, I am a college student conducting a voice classification study. I am currently fine-tuning the AST model, but the model is not learning properly from my dataset. The problem seems to be at trainer.train(), and I would like to know what shape train_dataset should have when fine-tuning. Could you share anything that might help, such as example code or advice? I would really appreciate it. Thank you.
Hi @coldpumpkinn, no worries.
I am not completely sure what your current issue is, as you've provided minimal information, but I'll try to give you as many pointers as I can. Firstly, if the training doesn't start properly, you can refer to this guide from HuggingFace, which shows how to fine-tune a model for audio classification. As for creating an audio dataset, you can start with this guide. If the issue is that training is not converging and you are sure the dataset has been prepared correctly, you might want to try fine-tuning a different model, i.e. not our distil-ast-audioset. In the first guide I mentioned, they fine-tuned facebook/wav2vec2-base, so you can try that out as well. I hope this helps. Cheers.
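On the shape question specifically, here is a rough sanity check you could run on one item of train_dataset before calling trainer.train(). It is only a sketch under a few assumptions: that each item is a dict, that the features sit under "input_values" and the class id under "label" or "labels", and that the expected per-example shape is (1024, 128), i.e. (time frames, mel bins), which I believe matches the default AST feature extractor settings. If your feature extractor config uses different values, pass them in. The helper names `nested_shape` and `check_example` are made up for this example and are not part of any library.

```python
def nested_shape(value):
    """Return the shape of an array/tensor (via .shape) or of nested lists."""
    if hasattr(value, "shape"):
        return tuple(value.shape)
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def check_example(example, expected_shape=(1024, 128)):
    """Return a list of problems found in one dataset item (empty if none).

    expected_shape is an assumption: (time frames, mel bins) for one example,
    without the batch dimension. Adjust it to your feature extractor config.
    """
    problems = []
    if "input_values" not in example:
        problems.append("missing 'input_values' key")
    else:
        shape = nested_shape(example["input_values"])
        if shape != tuple(expected_shape):
            problems.append(
                f"input_values has shape {shape}, expected {tuple(expected_shape)}"
            )
    label_key = "labels" if "labels" in example else "label"
    if label_key not in example:
        problems.append("missing 'label' / 'labels' key")
    elif isinstance(example[label_key], str):
        problems.append("label is a string; map class names to integer ids")
    return problems


# Toy items standing in for train_dataset[0]; replace with your own dataset.
good = {"input_values": [[0.0] * 128 for _ in range(1024)], "label": 3}
bad = {"audio": [0.0] * 16000, "label": "dog"}

print(check_example(good))
print(check_example(bad))
```

If the check reports a missing "input_values" key, the raw audio has probably not been passed through the feature extractor yet. If the label is a string, it needs to be mapped to an integer class id first.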