
During the first lesson of the Practical Deep Learning for Coders course, Jeremy mentioned that, with a bit of creativity, a simple computer vision model can be turned into a state-of-the-art audio classifier: the audio is converted to images and fed to the same image classification model. I was curious how I could train a music classifier, as I had never worked with audio data before.

You can find how I trained this music genre classifier using fast.ai.

Dataset

  1. The competition data
  2. Image data generated by converting the audio to mel spectrograms saved as images
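A mel spectrogram differs from a plain spectrogram in that its frequency bands are spaced evenly on the mel scale rather than in Hz, which gives low frequencies more resolution. The article does not say which tool was used for the conversion (libraries such as librosa provide a full `melspectrogram` function), so the snippet below is only a minimal sketch of that band spacing. The function names and the HTK-style conversion formula are assumptions chosen for illustration.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Return n_mels center frequencies in Hz, equally spaced on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_mels + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_mels)]


# Hypothetical settings: 8 bands between 0 Hz and 8 kHz.
centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0)
print([round(c) for c in centers])
```

The printed centers sit close together at low frequencies and spread out at high ones; each row of a mel spectrogram image corresponds to one such band.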

Training

Fast.ai was used to train this classifier with a ResNet50 vision learner for 10 epochs.
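The article does not include the training code, so the following is only a sketch of how a ResNet50 vision learner might be set up in fastai. The function name, the folder layout (one sub-folder of spectrogram images per genre), the 20% validation split, the image size, and the split into a frozen phase followed by an unfrozen phase are all assumptions, not the author's confirmed setup.

```python
from pathlib import Path


def train_genre_classifier(image_dir: Path, frozen_epochs: int = 10, unfrozen_epochs: int = 9):
    """Train a ResNet50 classifier on spectrogram images (hypothetical helper)."""
    # Imported lazily so this module can be loaded without fastai installed.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, error_rate, resnet50, vision_learner,
    )

    # Assumption: image_dir/<genre>/<file>.png, with 20% held out for validation.
    dls = ImageDataLoaders.from_folder(
        image_dir, valid_pct=0.2, seed=42, item_tfms=Resize(224)
    )
    learn = vision_learner(dls, resnet50, metrics=error_rate)

    # Assumption: train the new head first while the pretrained body is frozen...
    learn.fit_one_cycle(frozen_epochs)
    # ...then unfreeze and train the whole network.
    learn.unfreeze()
    learn.fit_one_cycle(unfrozen_epochs)
    return learn
```

Calling `train_genre_classifier(Path("spectrograms"))` with a hypothetical image folder would print one progress table per training phase, with the columns epoch, train_loss, valid_loss, error_rate and time.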

| epoch | train_loss | valid_loss | error_rate | time |
|-------|------------|------------|------------|-------|
| 0 | 2.312176 | 1.843815 | 0.558654 | 02:07 |
| 1 | 2.102361 | 1.719162 | 0.539061 | 02:08 |
| 2 | 1.867139 | 1.623988 | 0.527003 | 02:08 |
| 3 | 1.710557 | 1.527913 | 0.507661 | 02:07 |
| 4 | 1.629478 | 1.456836 | 0.479779 | 02:05 |
| 5 | 1.519305 | 1.433036 | 0.474253 | 02:05 |
| 6 | 1.457465 | 1.379757 | 0.464456 | 02:05 |
| 7 | 1.396283 | 1.369344 | 0.457925 | 02:05 |
| 8 | 1.359388 | 1.367973 | 0.453655 | 02:05 |
| 9 | 1.364363 | 1.368887 | 0.456167 | 02:04 |

| epoch | train_loss | valid_loss | error_rate | time |
|-------|------------|------------|------------|-------|
| 0 | 1.358123 | 1.100139 | 0.288713 | 05:14 |
| 1 | 1.129988 | 0.985213 | 0.260693 | 05:12 |
| 2 | 0.964907 | 0.909715 | 0.241337 | 05:17 |
| 3 | 0.804738 | 0.843515 | 0.222475 | 05:19 |
| 4 | 0.638846 | 0.795957 | 0.205347 | 05:16 |
| 5 | 0.475434 | 0.750069 | 0.192673 | 05:15 |
| 6 | 0.345060 | 0.742432 | 0.185198 | 05:12 |
| 7 | 0.247938 | 0.728758 | 0.177624 | 05:12 |
| 8 | 0.214708 | 0.727486 | 0.177871 | 05:11 |
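
Since fastai's `error_rate` is one minus accuracy, the logs above can be read as accuracies. The short sketch below does that arithmetic on the error rates copied from the two logs. The labels "first log" and "second log" and the helper name `summarize` are introduced here for illustration only.

```python
# Per-epoch error rates copied from the two training logs above.
first_log_errors = [0.558654, 0.539061, 0.527003, 0.507661, 0.479779,
                    0.474253, 0.464456, 0.457925, 0.453655, 0.456167]
second_log_errors = [0.288713, 0.260693, 0.241337, 0.222475, 0.205347,
                     0.192673, 0.185198, 0.177624, 0.177871]


def summarize(errors):
    """Return (best_epoch, best_accuracy, final_accuracy) for one log."""
    best_epoch = min(range(len(errors)), key=errors.__getitem__)
    return best_epoch, 1.0 - errors[best_epoch], 1.0 - errors[-1]


for name, errors in [("first log", first_log_errors),
                     ("second log", second_log_errors)]:
    best_epoch, best_acc, final_acc = summarize(errors)
    print(f"{name}: best epoch {best_epoch}, "
          f"best accuracy {best_acc:.4f}, final accuracy {final_acc:.4f}")
```

The first log ends at roughly 54.4% validation accuracy and the second at roughly 82.2%, with the lowest error rate of the second log reached at epoch 7.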

Examples

The example images provided in the demo come from the validation split of the Kaggle competition data, which was not used during training.