Audio Feature Extraction Models

This repository contains pre-trained models for audio feature extraction, specifically:

  • Tempo Detection: Estimates the tempo (BPM) of an audio track.
  • Key Detection: Estimates the musical key of an audio track (one of 12 key classes, plus Major/Minor quality).

Model Details

Tempo Model

  • Model Type: Custom CNN architecture for tempo classification.
  • Input: Audio segments converted to Mel spectrograms followed by autocorrelation.
  • Output: Tempo estimate in Beats Per Minute (BPM), constrained to the range [85, 170]; see the preprocessing sketch after this list.
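
The exact preprocessing pipeline is not documented on this card; the sketch below shows one plausible way to compute a Mel spectrogram and its autocorrelation with torchaudio. The file name, sample rate handling, and number of mel bands are assumptions, not values taken from the model.

import torch
import torch.nn.functional as F
import torchaudio

# Load audio and downmix to mono ("track.wav" is a placeholder)
waveform, sr = torchaudio.load("track.wav")
waveform = waveform.mean(dim=0, keepdim=True)

# Mel spectrogram (n_mels=128 is an assumed value)
mel = torchaudio.transforms.MelSpectrogram(sample_rate=sr, n_mels=128)(waveform)

# Collapse the mel axis to an energy envelope and autocorrelate it over time
env = mel.mean(dim=1)
env = env - env.mean()
acf = F.conv1d(env.unsqueeze(0), env.unsqueeze(0), padding=env.shape[-1] - 1).squeeze()
acf = acf[acf.shape[-1] // 2:]   # keep non-negative lags only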

Key Detection Models

  • Key Class Model: Classifies the input into one of 12 relative key classes.
  • Key Quality Model: Determines if the key is Major or Minor.
  • Input: Audio segments converted to Mel spectrograms.
  • Output:
    • Key Class: One of 12 key signatures.
    • Key Quality: Binary classification (0 for Major, 1 for Minor); see the decoding sketch after this list.
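
Given predictions from both models, a key name can be decoded roughly as below. The mapping from class index to pitch class (chromatic order starting at C) and the relative-minor convention are assumptions; verify against the model's id2label config.

# Hypothetical decoding helper. It assumes class index i corresponds to the
# major key i semitones above C, and that a Minor prediction refers to the
# relative minor of that key signature.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def decode_key(key_class_idx: int, key_quality_idx: int) -> str:
    if key_quality_idx == 0:                        # 0 = Major
        return f"{PITCH_CLASSES[key_class_idx]} Major"
    relative_minor = (key_class_idx + 9) % 12       # 1 = Minor
    return f"{PITCH_CLASSES[relative_minor]} Minor"

print(decode_key(0, 1))   # -> "A Minor" (relative minor of C Major) under these assumptions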

Usage

Prerequisites

  • Python 3.7+
  • PyTorch
  • torchaudio
  • transformers
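
The dependencies above can typically be installed with pip; exact version pins are not specified on this card.

pip install torch torchaudio transformers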

Loading Models

To use these models with Hugging Face's transformers library:

from transformers import AutoModelForAudioClassification

# Load Tempo Model
tempo_model = AutoModelForAudioClassification.from_pretrained("your_username/tempo_model")

# Load Key Models
key_class_model = AutoModelForAudioClassification.from_pretrained("your_username/key_class_model")
key_quality_model = AutoModelForAudioClassification.from_pretrained("your_username/key_quality_model")
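
A minimal end-to-end inference sketch is shown below. It assumes the repositories include a preprocessing configuration compatible with AutoFeatureExtractor and that class labels are stored in the model config; neither is confirmed by this card, and the repository id and file name are placeholders.

import torch
import torchaudio
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

repo_id = "your_username/tempo_model"               # placeholder repository id
feature_extractor = AutoFeatureExtractor.from_pretrained(repo_id)
model = AutoModelForAudioClassification.from_pretrained(repo_id)

# Load audio and downmix to mono ("track.wav" is a placeholder)
waveform, sr = torchaudio.load("track.wav")
waveform = waveform.mean(dim=0)

inputs = feature_extractor(waveform.numpy(), sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])          # predicted tempo label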