Audio-to-Audio
French
audio
speech

Master Model Card: Vibravox Audio Bandwidth Extension Models

Overview

This master model card serves as an entry point for exploring multiple audio bandwidth extension (BWE) models trained on different sensor data from the Vibravox dataset.

These models are designed to enhance the audio quality of body-conducted speech by denoising it and regenerating mid and high frequencies from low-frequency content only.

The models are trained on specific sensors to address various audio capture scenarios using body-conducted sound and vibration sensors.

Disclaimer

Each of these models has been trained for specific non-conventional speech sensors and is intended to be used with in-domain data.

Please be advised that using these models outside their intended sensor data may result in suboptimal performance.

Usage

All models are trained using Configurable EBEN (see the publication in IEEE TASLP, also available on arXiv) and adapted to different sensor inputs. They are intended to be used at a sample rate of 16 kHz.
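Since the models expect 16 kHz input, it can help to verify a recording's sample rate before inference. The following is a minimal sketch using only the Python standard library; the helper name `check_sample_rate` and the constant `EXPECTED_SAMPLE_RATE_HZ` are hypothetical and not part of the Vibravox code base. It only checks WAV files and does not resample; resampling would need a dedicated audio library.

```python
import wave

# Sample rate stated in this model card; the constant name is illustrative.
EXPECTED_SAMPLE_RATE_HZ = 16_000


def check_sample_rate(path: str, expected_hz: int = EXPECTED_SAMPLE_RATE_HZ) -> int:
    """Return the WAV file's sample rate, or raise ValueError if it differs from expected_hz."""
    with wave.open(path, "rb") as wav_file:
        actual_hz = wav_file.getframerate()
    if actual_hz != expected_hz:
        raise ValueError(
            f"{path} is sampled at {actual_hz} Hz; "
            f"resample to {expected_hz} Hz before inference."
        )
    return actual_hz
```

Calling `check_sample_rate("recording.wav")` on a file with a different rate raises a `ValueError` naming the mismatch, which is easier to diagnose than degraded output from a silently mismatched input.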

Training Procedure

Detailed instructions for reproducing the experiments are available on the jhauret/vibravox GitHub repository and in the Vibravox paper on arXiv.

Available Models

The following models are available, each trained on a different sensor, using either the speech_clean subset or a synthetic mix of the speech_clean and speechless-noisy subsets of the Vibravox dataset (https://huggingface.co/datasets/Cnam-LMSSC/vibravox):

| Transducer | EBEN configuration | Hugging Face model trained on speech-clean | Hugging Face model trained on synthetically mixed speech-clean and speechless-noisy |
|---|---|---|---|
| In-ear comply foam-embedded microphone | M=4,P=2,Q=4 | EBEN_soft_in_ear_microphone | EBEN_noisy_soft_in_ear_microphone |
| In-ear rigid earpiece-embedded microphone | M=4,P=2,Q=4 | EBEN_rigid_in_ear_microphone | EBEN_noisy_rigid_in_ear_microphone |
| Forehead miniature vibration sensor | M=4,P=4,Q=4 | EBEN_forehead_accelerometer | EBEN_noisy_forehead_accelerometer |
| Temple vibration pickup | M=4,P=1,Q=4 | EBEN_temple_vibration_pickup | EBEN_noisy_temple_vibration_pickup |
| Laryngophone | M=4,P=2,Q=4 | EBEN_throat_microphone | EBEN_noisy_throat_microphone |
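
To pick a model programmatically, the table above can be mirrored as a small lookup. This is only a sketch: the names `MODELS`, `EbenEntry`, `parse_config`, and `repo_id` are hypothetical, and the `Cnam-LMSSC` namespace prefix is an assumption based on the organization hosting this card and the dataset. The M, P, and Q values are parsed as plain integers, since their meaning is defined in the EBEN publication rather than in this card.

```python
from typing import Dict, NamedTuple

# Assumption: individual models live under the same organization as this card.
HF_NAMESPACE = "Cnam-LMSSC"


class EbenEntry(NamedTuple):
    config: str        # EBEN configuration string, as listed in the table
    clean_model: str   # model trained on speech-clean
    noisy_model: str   # model trained on the synthetic speech-clean / speechless-noisy mix


# Keys are the sensor suffixes used in the model names from the table.
MODELS: Dict[str, EbenEntry] = {
    "soft_in_ear_microphone": EbenEntry(
        "M=4,P=2,Q=4", "EBEN_soft_in_ear_microphone", "EBEN_noisy_soft_in_ear_microphone"),
    "rigid_in_ear_microphone": EbenEntry(
        "M=4,P=2,Q=4", "EBEN_rigid_in_ear_microphone", "EBEN_noisy_rigid_in_ear_microphone"),
    "forehead_accelerometer": EbenEntry(
        "M=4,P=4,Q=4", "EBEN_forehead_accelerometer", "EBEN_noisy_forehead_accelerometer"),
    "temple_vibration_pickup": EbenEntry(
        "M=4,P=1,Q=4", "EBEN_temple_vibration_pickup", "EBEN_noisy_temple_vibration_pickup"),
    "throat_microphone": EbenEntry(
        "M=4,P=2,Q=4", "EBEN_throat_microphone", "EBEN_noisy_throat_microphone"),
}


def parse_config(config: str) -> Dict[str, int]:
    """Turn a string such as 'M=4,P=2,Q=4' into {'M': 4, 'P': 2, 'Q': 4}."""
    pairs = (item.split("=") for item in config.split(","))
    return {key.strip(): int(value) for key, value in pairs}


def repo_id(sensor: str, noisy: bool = False) -> str:
    """Build a '<namespace>/<model name>' identifier for the given sensor key."""
    entry = MODELS[sensor]
    name = entry.noisy_model if noisy else entry.clean_model
    return f"{HF_NAMESPACE}/{name}"
```

For example, `repo_id("throat_microphone", noisy=True)` builds the identifier for the laryngophone model trained on the noisy mix. Whether that identifier resolves on the Hub depends on the namespace assumption above.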
