During the first lesson of the Practical Deep Learning for Coders course, Jeremy mentioned that, with a bit of creativity, a simple computer vision model can be turned into a state-of-the-art audio classifier: convert the audio to images and train an ordinary image classification model on them. I was curious how I could train a music classifier, as I had never worked with audio data before.
[You can find how I trained this music genre classifier using fast.ai](https://kurianbenoy.com/ml-blog/fastai/fastbook/2022/05/01/AudioCNNDemo.html).
## Dataset
1. [The competition data](https://www.kaggle.com/competitions/kaggle-pog-series-s01e02/data)
2. [Image data generated by converting the audio to mel spectrogram images](https://www.kaggle.com/datasets/dienhoa/music-genre-spectrogram-pogchamps)
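
To make the audio-to-image step concrete, here is a minimal NumPy-only sketch of how a mel spectrogram can be computed from a waveform. It is not the code used to build the dataset above; the function names, the synthetic 440 Hz test tone, and the parameters (`n_fft=1024`, `hop=256`, `n_mels=64`) are all illustrative assumptions.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style mel scale (one common convention, assumed here).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram_db(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Return a (n_mels, n_frames) array of mel-band power in decibels."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_bins)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    return 10.0 * np.log10(np.maximum(mel, 1e-10)).T      # (n_mels, n_frames)

# Hypothetical input: one second of a 440 Hz sine wave instead of a real track.
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
spec = mel_spectrogram_db(tone, sr)
peak_band = int(spec.mean(axis=1).argmax())
```

The resulting 2-D array can be saved as an image (for example with a plotting library), which is what lets a standard image classifier be trained on audio.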
## Training
Fast.ai was used to train this classifier with a ResNet50 vision learner for 10 epochs.
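
The exact training code is in the blog post linked above. As a rough sketch only, a ResNet50 vision learner in fastai can be set up along these lines. The function name, the one-folder-per-genre layout, the 20% validation split, the seed, the 224-pixel resize, and the use of `fine_tune` are assumptions made for illustration, not details taken from this README.

```python
from pathlib import Path

def train_genre_classifier(image_dir, epochs=10, valid_pct=0.2, seed=42):
    """Train a ResNet50 classifier on spectrogram images (illustrative sketch).

    Assumes `image_dir` contains one sub-folder per genre, each holding
    that genre's spectrogram images.
    """
    # Imported inside the function so this file loads even without fastai.
    from fastai.vision.all import (
        ImageDataLoaders,
        Resize,
        error_rate,
        resnet50,
        vision_learner,
    )

    # Labels come from the folder names; a random split holds out validation data.
    dls = ImageDataLoaders.from_folder(
        Path(image_dir),
        valid_pct=valid_pct,
        seed=seed,
        item_tfms=Resize(224),
    )
    # Pretrained ResNet50 body with a new classification head.
    learn = vision_learner(dls, resnet50, metrics=error_rate)
    # Trains the head with the body frozen first, then unfreezes for `epochs` epochs.
    learn.fine_tune(epochs)
    return learn
```

Calling `train_genre_classifier("path/to/spectrograms")` (a placeholder path) would print a per-epoch table with the same columns as the ones below: `train_loss`, `valid_loss`, `error_rate`, and `time`.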
| epoch | train_loss | valid_loss | error_rate | time |
|-------|---------------|---------------|---------------|-------|
|0 | 2.312176 | 1.843815 | 0.558654 | 02:07 |
|1 | 2.102361 | 1.719162 | 0.539061 | 02:08 |
|2 | 1.867139 | 1.623988 | 0.527003 | 02:08 |
|3 | 1.710557 | 1.527913 | 0.507661 | 02:07 |
|4 | 1.629478 | 1.456836 | 0.479779 | 02:05 |
|5 | 1.519305 | 1.433036 | 0.474253 | 02:05 |
|6 | 1.457465 | 1.379757 | 0.464456 | 02:05 |
|7 | 1.396283 | 1.369344 | 0.457925 | 02:05 |
|8 | 1.359388 | 1.367973 | 0.453655 | 02:05 |
|9 | 1.364363 | 1.368887 | 0.456167 | 02:04 |

| epoch | train_loss | valid_loss | error_rate | time |
|-------|---------------|---------------|---------------|-------|
| 0 | 1.358123 | 1.100139 | 0.288713 | 05:14 |
| 1 | 1.129988 | 0.985213 | 0.260693 | 05:12 |
| 2 | 0.964907 | 0.909715 | 0.241337 | 05:17 |
| 3 | 0.804738 | 0.843515 | 0.222475 | 05:19 |
| 4 | 0.638846 | 0.795957 | 0.205347 | 05:16 |
| 5 | 0.475434 | 0.750069 | 0.192673 | 05:15 |
| 6 | 0.345060 | 0.742432 | 0.185198 | 05:12 |
| 7 | 0.247938 | 0.728758 | 0.177624 | 05:12 |
| 8 | 0.214708 | 0.727486 | 0.177871 | 05:11 |
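
For reading the second table, `error_rate` is simply one minus accuracy. A small sketch of the arithmetic, using the `error_rate` column copied from that table (the variable and function names are just for illustration):

```python
# error_rate values from the second training table, indexed by epoch.
error_rates = [
    0.288713, 0.260693, 0.241337, 0.222475, 0.205347,
    0.192673, 0.185198, 0.177624, 0.177871,
]

def accuracy_pct(error_rate):
    """Convert an error rate in [0, 1] to accuracy in percent."""
    return round((1.0 - error_rate) * 100.0, 2)

best_epoch = min(range(len(error_rates)), key=error_rates.__getitem__)
best_accuracy = accuracy_pct(error_rates[best_epoch])
final_accuracy = accuracy_pct(error_rates[-1])

print(f"best epoch: {best_epoch}, accuracy: {best_accuracy}%")
print(f"final epoch accuracy: {final_accuracy}%")
```

So the lowest validation error is reached at epoch 7, and the final epoch is essentially tied with it at roughly 82% accuracy.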
## Examples
The example images provided in the demo come from the validation split of the Kaggle competition data, which was not used during training.