Commit f27af8d by jacob-c · 1 parent: 1f2295d
README.md CHANGED
@@ -10,36 +10,25 @@ pinned: false
  license: mit
  ---
 
- # Music Classification with MIT's AST Model 🎵
+ # Audio Classification App
 
- This Hugging Face Space demonstrates audio classification using MIT's Audio Spectrogram Transformer (AST) model. The model can identify various types of music, instruments, and sounds in audio files.
+ This is an audio classification application that uses the MIT AST (Audio Spectrogram Transformer) model to classify audio files. The model can recognize various sounds and music categories from the AudioSet dataset.
 
  ## Features
-
- - Simple, user-friendly interface
- - Support for multiple audio formats (WAV, MP3, OGG, FLAC)
- - Top-5 predictions with confidence scores
- - Real-time processing
-
- ## How to Use
-
- 1. Click the "Upload Music File" button or drag and drop an audio file
- 2. Wait a few seconds for the model to process the audio
- 3. View the classification results with confidence scores
-
- ## Model Details
-
- This app uses the `MIT/ast-finetuned-audioset-10-10-0.4593` model, which is trained on AudioSet and can recognize a wide variety of sounds and music styles. The model converts audio into spectrograms and uses a transformer architecture to classify the audio content.
-
- ## Technical Notes
-
- - The model processes audio at 16kHz
- - Results show top 5 predictions with confidence scores
- - Processing is done on Hugging Face's infrastructure
- - No local installation required
-
- ## Credits
-
- - Model: [MIT AST](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
- - Interface: Gradio
- - Deployment: Hugging Face Spaces
+ - Simple web interface for audio file upload
+ - Real-time classification using Hugging Face's AST model
+ - Displays classification results in JSON format
+
+ ## Usage
+ 1. Open the web interface
+ 2. Upload an audio file (supports various formats including MP3, WAV, etc.)
+ 3. Wait for the classification results
+ 4. View the predicted categories and their confidence scores
+
+ ## Technical Details
+ - Built with Gradio for the web interface
+ - Uses Hugging Face's AST model (MIT/ast-finetuned-audioset-10-10-0.4593)
+ - Deployed on Hugging Face Spaces
+
+ ## Requirements
+ The required packages are listed in `requirements.txt`
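
The README describes an upload, classify, and display-as-JSON flow, but this commit does not show the Space's application code. The following is a minimal sketch of how such an app might be wired together, assuming the `transformers` audio-classification pipeline and a Gradio `Interface`. The function names `top_predictions` and `build_demo`, the default of five results, and the rounding to four decimals are illustrative assumptions, not the author's implementation.

```python
def top_predictions(predictions, top_k=5):
    """Return the top_k predictions, highest score first, with rounded scores.

    `predictions` is a list of {"label": str, "score": float} dicts, which is
    the shape the transformers audio-classification pipeline returns.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)[:top_k]
    return [
        {"label": p["label"], "score": round(float(p["score"]), 4)}
        for p in ranked
    ]


def build_demo(model_id="MIT/ast-finetuned-audioset-10-10-0.4593"):
    """Assemble a Gradio interface around the AST model (hypothetical layout)."""
    # Heavy dependencies are imported lazily so top_predictions works without them.
    import gradio as gr
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)

    def classify(audio_path):
        # Decoding a file path relies on the pipeline's audio loading.
        return top_predictions(classifier(audio_path, top_k=5))

    return gr.Interface(
        fn=classify,
        inputs=gr.Audio(type="filepath", label="Audio file"),
        outputs=gr.JSON(label="Predictions"),
        title="Audio Classification App",
    )
```

Under these assumptions, calling `build_demo().launch()` would start the web interface. The model is downloaded on first use.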
 
 
 
 
 
 
 
 
 
 
 
TunePocket-Christmas-Spirit-10-Sec-Intro-Preview.mp3 ADDED
Binary file (166 kB).
 
requirements.txt CHANGED
@@ -5,3 +5,4 @@ torchaudio>=2.1.2
  numpy>=1.26.2
  accelerate>=0.25.0
  librosa>=0.10.1
+ requests>=2.31.0
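
Since every entry in `requirements.txt` is a `name>=version` minimum, one way to confirm that an environment satisfies them is sketched below. It is a simplified check that compares only the first three numeric components and ignores pre-release tags. The helper names are hypothetical, and a real project would more likely rely on `pip` or the `packaging` library.

```python
import re
from importlib import metadata


def numeric_version(version):
    """Reduce a version string such as '2.1.2+cu118' to a tuple like (2, 1, 2)."""
    public = version.split("+")[0]  # drop local build tags such as '+cu118'
    parts = [int(part) for part in re.findall(r"\d+", public)[:3]]
    parts += [0] * (3 - len(parts))  # pad '1.0' to (1, 0, 0)
    return tuple(parts)


def parse_minimum(line):
    """Split a 'name>=version' requirement line into (name, version tuple)."""
    name, _, version = line.strip().partition(">=")
    return name.strip(), numeric_version(version)


def check_minimums(lines):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        name, minimum = parse_minimum(line)
        try:
            installed = numeric_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report
```

For example, `check_minimums(open("requirements.txt"))` would return a dictionary keyed by package name for the file in this repository.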