Testing the Model with an Example Wav File
Hey,
Does anyone know how to use this model with an example wav file? I want to use an audio file in the /content/ directory of my Colab notebook
-- path = 'example.wav'
and thought I could just replace the line
-- process_func(signal, sampling_rate)
with
-- process_func(path, sampling_rate)
But sadly it's not that easy. Can anyone help?
Hi,
please have a look at the following tutorial:
https://github.com/audeering/w2v2-how-to
I have now also added a link in the model card.
cheers
Johannes
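A note on why swapping in the path does not work: process_func(signal, sampling_rate) takes the decoded audio samples, not a file name, so the file has to be read first. Below is a minimal sketch of that step using only the Python standard library. The helper names (read_wav_mono16, write_test_tone) are hypothetical and not part of the tutorial, and the sketch assumes a 16-bit mono PCM file; in practice a library such as librosa or audiofile does this in one call and returns a NumPy array.

```python
import array
import math
import os
import sys
import tempfile
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file (hypothetical helper).

    Returns (samples, sampling_rate), where samples is a list of
    floats in [-1.0, 1.0). This is decoded audio, not a path.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        sampling_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    pcm = array.array("h")
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return [value / 32768.0 for value in pcm], sampling_rate


def write_test_tone(path, sampling_rate=16000, n_samples=1600, freq=440.0):
    """Write a short sine tone so the sketch runs without a real file."""
    pcm = array.array(
        "h",
        (
            int(0.5 * 32767 * math.sin(2 * math.pi * freq * i / sampling_rate))
            for i in range(n_samples)
        ),
    )
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sampling_rate)
        wav.writeframes(pcm.tobytes())


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.wav")
    write_test_tone(path)
    # This pair is what would be handed on, instead of the path itself.
    signal, sampling_rate = read_wav_mono16(path)

print(len(signal), sampling_rate)
```

The resulting samples would still need converting to a float32 NumPy array before being passed to the model, which is an assumption based on how the tutorial builds its input signal.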
Thanks for the fast reply, I looked into the tutorial and it helped a lot!
I loaded the model as in the tutorial and evaluated it on a few wav files. Most of the time I got promising results, but a few times the values of arousal, dominance and valence overshot the boundary of 1.0. As far as I understand, these values should be in the range 0 to 1, or have I maybe misread the paper?
import os
import librosa
import audonnx
import audinterface

# Load the model
model_root = 'model'
model = audonnx.load(model_root)

# Define the input signal
input_file = '/content/Pulp Fiction Best Scene - Does He Look Like a Bitch.mp4.39.wav'
signal, sampling_rate = librosa.load(input_file, sr=16000, mono=True)

# Create an interface to process the signal
interface = audinterface.Feature(
    model.labels('logits'),
    process_func=model,
    process_func_args={'outputs': 'logits'},
    sampling_rate=sampling_rate,
    resample=True,
    verbose=True,
)

# Process the signal using the interface
output = interface.process_signal(signal, sampling_rate)
print(output)
Results:
Arousal: 1.078532
Dominance: 1.043147
Valence: -0.137843
I used a rather aggressive line by Samuel L. Jackson from the movie Pulp Fiction.
I used librosa to downsample the audio to a sampling rate of 16000 Hz. Could this be the cause of the out-of-range results?
It can indeed happen that, in rare cases, you observe values slightly outside the expected range of [0..1]. And as you mention, your example is indeed quite extreme. If your application expects values in [0..1], simply clip them to that interval.
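The clipping suggested above could look roughly like the following sketch. It uses the three values reported earlier in this thread; the function name clip_to_unit_interval and the dictionary layout are assumptions for illustration, not part of the model's API.

```python
def clip_to_unit_interval(scores):
    """Clamp every score to the closed interval [0.0, 1.0]."""
    return {name: min(max(value, 0.0), 1.0) for name, value in scores.items()}


# Values reported earlier in this thread.
raw_scores = {
    "arousal": 1.078532,
    "dominance": 1.043147,
    "valence": -0.137843,
}

clipped_scores = clip_to_unit_interval(raw_scores)
print(clipped_scores)
# prints {'arousal': 1.0, 'dominance': 1.0, 'valence': 0.0}
```

If the output is kept as a pandas DataFrame, DataFrame.clip(lower=0, upper=1) should give the same result without leaving pandas, and numpy.clip does the same for plain arrays.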