- README.md +19 -30
- TunePocket-Christmas-Spirit-10-Sec-Intro-Preview.mp3 +0 -0
- requirements.txt +1 -0
README.md
CHANGED
@@ -10,36 +10,25 @@ pinned: false
 license: mit
 ---
 
-#
+# Audio Classification App
 
-This
+This is an audio classification application that uses the MIT AST (Audio Spectrogram Transformer) model to classify audio files. The model can recognize various sounds and music categories from the AudioSet dataset.
 
 ## Features
-
--
--
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-- The model processes audio at 16kHz
-- Results show top 5 predictions with confidence scores
-- Processing is done on Hugging Face's infrastructure
-- No local installation required
-
-## Credits
-
-- Model: [MIT AST](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
-- Interface: Gradio
-- Deployment: Hugging Face Spaces
+- Simple web interface for audio file upload
+- Real-time classification using Hugging Face's AST model
+- Displays classification results in JSON format
+
+## Usage
+1. Open the web interface
+2. Upload an audio file (supports various formats including MP3, WAV, etc.)
+3. Wait for the classification results
+4. View the predicted categories and their confidence scores
+
+## Technical Details
+- Built with Gradio for the web interface
+- Uses Hugging Face's AST model (MIT/ast-finetuned-audioset-10-10-0.4593)
+- Deployed on Hugging Face Spaces
+
+## Requirements
+The required packages are listed in `requirements.txt`
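The README text in this change describes a flow (upload a file, classify it with the AST checkpoint, show the predicted categories and confidence scores as JSON), but the app code itself is not part of this diff. The following is a minimal sketch of that flow, not the Space's actual implementation. It assumes the checkpoint is loaded locally through the `transformers` audio-classification pipeline; the names `format_predictions`, `classify`, and `build_demo` are hypothetical, and the top-5 cutoff is borrowed from the removed README lines.

```python
MODEL_ID = "MIT/ast-finetuned-audioset-10-10-0.4593"  # checkpoint named in the README


def format_predictions(predictions, top_k=5):
    """Keep the top_k highest-scoring labels and round scores for display.

    `predictions` is a list of {"label": str, "score": float} dicts, the
    shape returned by the transformers audio-classification pipeline.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [
        {"label": p["label"], "confidence": round(float(p["score"]), 4)}
        for p in ranked[:top_k]
    ]


def build_demo():
    """Build a Gradio interface (hypothetical layout, not the Space's code)."""
    # Heavy imports stay inside the function so format_predictions can be
    # used without gradio or transformers installed.
    import gradio as gr
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=MODEL_ID)

    def classify(audio_path):
        # Gradio passes None when no file has been uploaded.
        if audio_path is None:
            return []
        # The pipeline decodes the file itself, which requires ffmpeg.
        return format_predictions(classifier(audio_path, top_k=5))

    return gr.Interface(
        fn=classify,
        inputs=gr.Audio(type="filepath"),  # hand the function a path on disk
        outputs=gr.JSON(),                 # render the list of dicts as JSON
        title="Audio Classification App",
    )
```

Under these assumptions, `build_demo().launch()` would serve the interface. Keeping the formatting step separate from the model call makes the JSON shape easy to check without downloading the checkpoint.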
TunePocket-Christmas-Spirit-10-Sec-Intro-Preview.mp3
ADDED
Binary file (166 kB)
requirements.txt
CHANGED
@@ -5,3 +5,4 @@ torchaudio>=2.1.2
 numpy>=1.26.2
 accelerate>=0.25.0
 librosa>=0.10.1
+requests>=2.31.0
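The `requirements.txt` hunk adds `requests>=2.31.0`, a minimum-version pin like the three context lines around it. As an illustration of what such a pin means, here is a small sketch that parses `name>=version` lines and checks a version string against the minimum. `parse_requirement` and `satisfies` are hypothetical helpers, and the numeric comparison assumes plain dotted versions only; real tooling such as pip handles the full version grammar.

```python
def parse_requirement(line):
    """Split a 'name>=version' line into (name, (major, minor, ...))."""
    name, _, version = line.strip().partition(">=")
    if not version:
        raise ValueError(f"expected 'name>=version', got {line!r}")
    return name.strip(), tuple(int(part) for part in version.strip().split("."))


def satisfies(installed, minimum):
    """Return True if a dotted version string meets the minimum tuple."""
    have = tuple(int(part) for part in installed.split("."))
    # Pad the shorter tuple with zeros so that "2.31" equals "2.31.0".
    width = max(len(have), len(minimum))
    have += (0,) * (width - len(have))
    minimum += (0,) * (width - len(minimum))
    return have >= minimum


# The four lines visible in the requirements.txt hunk.
REQUIREMENTS = """\
numpy>=1.26.2
accelerate>=0.25.0
librosa>=0.10.1
requests>=2.31.0
"""

pins = dict(parse_requirement(line) for line in REQUIREMENTS.splitlines())
```

Comparing integer tuples rather than raw strings matters here: as strings, `"2.4.0"` would sort after `"2.31.0"`, while as tuples `(2, 4, 0)` correctly falls below `(2, 31, 0)`.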