---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- audiofolder
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: wav2vec2-base-Drum_Kit_Sounds
  results: []
language:
- en
pipeline_tag: audio-classification
---

# wav2vec2-base-Drum_Kit_Sounds

This model is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base).

It achieves the following results on the evaluation set:
- Loss: 1.0887
- Accuracy: 0.7812
- F1
  - Weighted: 0.7692
  - Micro: 0.7812
  - Macro: 0.7845
- Recall
  - Weighted: 0.7812
  - Micro: 0.7812
  - Macro: 0.8187
- Precision
  - Weighted: 0.8717
  - Micro: 0.7812
  - Macro: 0.8534
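
The weighted, micro, and macro variants above differ only in how per-class scores are combined. A minimal pure-Python sketch using toy labels over the same four drum classes (the labels are illustrative only, not this model's predictions):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Compute macro-, weighted-, and micro-averaged F1 for multiclass labels."""
    classes = sorted(set(y_true))
    support = Counter(y_true)          # number of true samples per class
    per_class_f1 = {}
    tp_total = 0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class_f1[c] = 2 * tp / denom if denom else 0.0
        tp_total += tp
    # Macro: every class counts equally; weighted: classes count by support;
    # micro: pool all predictions (for single-label multiclass this equals accuracy).
    macro = sum(per_class_f1.values()) / len(classes)
    weighted = sum(per_class_f1[c] * support[c] for c in classes) / len(y_true)
    micro = tp_total / len(y_true)
    return macro, weighted, micro

# Toy example with the four drum classes
y_true = ["kick", "kick", "snare", "toms", "overheads", "snare"]
y_pred = ["kick", "snare", "snare", "toms", "overheads", "kick"]
macro, weighted, micro = f1_scores(y_true, y_pred)
print(macro, weighted, micro)  # 0.75 0.666... 0.666...
```

Note that for a single-label multiclass task, micro-averaged F1, precision, and recall all reduce to accuracy, which is why every "Micro" value above equals the reported accuracy of 0.7812.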

## Model description

This model performs multiclass classification of drum-kit sounds, predicting which type of drum is hit in an audio sample. The four classes are: kick, overheads, snare, and toms.

For details on how this model was created, see the project notebook: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/blob/main/Audio-Projects/Classification/Audio-Drum_Kit_Sounds.ipynb
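
As a sketch of how inference might look, the model can be loaded through the `audio-classification` pipeline. The repo id below assumes this card is published under the author's Hugging Face account; substitute a local checkpoint path if not.

```python
import numpy as np
from transformers import pipeline

# Load the fine-tuned checkpoint via the audio-classification pipeline.
# Assumed repo id; replace with a local path to the checkpoint if needed.
classifier = pipeline(
    "audio-classification",
    model="DunnBC22/wav2vec2-base-Drum_Kit_Sounds",
)

# One second of silence at 16 kHz stands in for a real drum sample;
# in practice, pass the path to a .wav file instead.
waveform = np.zeros(16000, dtype=np.float32)
predictions = classifier(waveform)
print(predictions)  # list of {"label": ..., "score": ...} dicts
```

The pipeline resamples file inputs to the feature extractor's 16 kHz rate automatically; raw arrays are assumed to already be at that rate.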

## Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex problem using machine learning. Note that the training set is small (4 optimizer steps per epoch at batch size 32, i.e. roughly 128 samples), so the results may not generalize beyond this dataset.

## Training and evaluation data

Dataset Source: https://www.kaggle.com/datasets/anubhavchhabra/drum-kit-sound-samples

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 12
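
These settings map directly onto `transformers.TrainingArguments`; a minimal sketch (the `output_dir` value is a placeholder):

```python
from transformers import TrainingArguments

# Hyperparameters from the list above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="wav2vec2-base-Drum_Kit_Sounds",
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=12,
)
```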

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Weighted F1 | Micro F1 | Macro F1 | Weighted Recall | Micro Recall | Macro Recall | Weighted Precision | Micro Precision | Macro Precision |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----------:|:--------:|:--------:|:---------------:|:------------:|:------------:|:------------------:|:---------------:|:---------------:|
| 1.3743        | 1.0   | 4    | 1.3632          | 0.5625   | 0.5801      | 0.5625   | 0.5678   | 0.5625          | 0.5625       | 0.5670       | 0.6786             | 0.5625          | 0.6429          |
| 1.3074        | 2.0   | 8    | 1.3149          | 0.3438   | 0.2567      | 0.3438   | 0.2696   | 0.3438          | 0.3438       | 0.375        | 0.3067             | 0.3438          | 0.3148          |
| 1.2393        | 3.0   | 12   | 1.3121          | 0.2188   | 0.0785      | 0.2188   | 0.0897   | 0.2188          | 0.2188       | 0.25         | 0.0479             | 0.2188          | 0.0547          |
| 1.2317        | 4.0   | 16   | 1.3112          | 0.2812   | 0.1800      | 0.2812   | 0.2057   | 0.2812          | 0.2812       | 0.3214       | 0.2698             | 0.2812          | 0.3083          |
| 1.2107        | 5.0   | 20   | 1.2604          | 0.4375   | 0.3030      | 0.4375   | 0.3462   | 0.4375          | 0.4375       | 0.5          | 0.2552             | 0.4375          | 0.2917          |
| 1.1663        | 6.0   | 24   | 1.2112          | 0.4688   | 0.3896      | 0.4688   | 0.4310   | 0.4688          | 0.4688       | 0.5268       | 0.5041             | 0.4688          | 0.5404          |
| 1.1247        | 7.0   | 28   | 1.1746          | 0.5938   | 0.5143      | 0.5938   | 0.5603   | 0.5938          | 0.5938       | 0.6562       | 0.5220             | 0.5938          | 0.5609          |
| 1.0856        | 8.0   | 32   | 1.1434          | 0.5938   | 0.5143      | 0.5938   | 0.5603   | 0.5938          | 0.5938       | 0.6562       | 0.5220             | 0.5938          | 0.5609          |
| 1.0601        | 9.0   | 36   | 1.1417          | 0.6562   | 0.6029      | 0.6562   | 0.6389   | 0.6562          | 0.6562       | 0.7125       | 0.8440             | 0.6562          | 0.8217          |
| 1.0375        | 10.0  | 40   | 1.1227          | 0.6875   | 0.6582      | 0.6875   | 0.6831   | 0.6875          | 0.6875       | 0.7330       | 0.8457             | 0.6875          | 0.8237          |
| 1.0168        | 11.0  | 44   | 1.1065          | 0.7812   | 0.7692      | 0.7812   | 0.7845   | 0.7812          | 0.7812       | 0.8187       | 0.8717             | 0.7812          | 0.8534          |
| 1.0093        | 12.0  | 48   | 1.0887          | 0.7812   | 0.7692      | 0.7812   | 0.7845   | 0.7812          | 0.7812       | 0.8187       | 0.8717             | 0.7812          | 0.8534          |


### Framework versions

- Transformers 4.25.1
- Pytorch 1.12.1
- Datasets 2.8.0
- Tokenizers 0.12.1