Commit a8aa9fc by anthony-wss
1 Parent(s): 37e427d

Create README
README.md
ADDED
@@ -0,0 +1,47 @@
---
language:
- en
tags:
- AudioClassification
datasets:
- marsyas/gtzan
metrics:
- accuracy
---

# Audio Classification

This repo contains code and notes for the Hugging Face Audio Course tutorial on [fine-tuning a model for music classification](https://huggingface.co/learn/audio-course/chapter4/fine-tuning).

## Dataset

This project uses [GTZAN](https://huggingface.co/datasets/marsyas/gtzan), a music genre classification dataset covering 10 genres.

## Usage

```shell
export HUGGINGFACE_TOKEN=<your_token>
python main.py
```

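`main.py` is not shown in this commit, so the following is only a sketch of how a script might pick up the exported token and fail early when it is missing. The helper name `read_hf_token` is hypothetical.

```python
import os


def read_hf_token(env=None):
    """Return the token stored in HUGGINGFACE_TOKEN, or raise a clear error.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACE_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HUGGINGFACE_TOKEN is not set; export it before running main.py"
        )
    return token
```

Checking for the token at startup avoids discovering a missing credential only after training has finished.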
## Performance

Accuracy: 0.81 (default settings)

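Accuracy is the fraction of clips whose predicted genre matches the reference label. A minimal sketch with made-up label ids (hypothetical values, not results from this repo):

```python
def accuracy(predictions, references):
    """Fraction of positions where the predicted label equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical predictions for five clips; one of them is wrong.
preds = [0, 3, 3, 9, 5]
labels = [0, 3, 2, 9, 5]
score = accuracy(preds, labels)
print(f"accuracy = {score:.2f}")
```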
## Notes

1. 🤗 Datasets provides a `train_test_split()` method for splitting a dataset.

2. `feature_extractor` does not resample audio, so the dataset must already match the sampling rate the model expects.
   - To resample, cast the `audio` column with `cast_column()`; the audio is then resampled on the fly when examples are loaded.
   ```python
   from datasets import Audio

   # Resample to the rate the feature extractor expects.
   gtzan = gtzan.cast_column("audio", Audio(sampling_rate=feature_extractor.sampling_rate))
   ```

3. `feature_extractor` normalizes the audio and returns `input_values` and `attention_mask`.

4. `.map()` supports batched preprocessing (`batched=True`).

5. Why does `AutoModelForAudioClassification.from_pretrained` take `label2id` and `id2label`?
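
On note 5, one likely reason is that the two mappings are stored in the model config, so downstream tools such as `pipeline` can report genre names instead of bare class indices. A minimal sketch of building them, assuming the ten GTZAN genre names in the order below (the order is an assumption; the dataset's own label feature is the authoritative source):

```python
# Assumed genre order; check the dataset's label feature before relying on it.
genres = [
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
]

# Class index -> human-readable name, and the inverse lookup.
id2label = {i: name for i, name in enumerate(genres)}
label2id = {name: i for i, name in id2label.items()}

num_labels = len(id2label)
print(num_labels, id2label[0], label2id["rock"])
```

These dictionaries would then be passed to `from_pretrained` as `num_labels=num_labels`, `label2id=label2id` and `id2label=id2label`.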