max audio model input length

by arubittu - opened May 21, 2024

May 21, 2024

What is the maximum audio input length I can classify, assuming my sampling rate is 16 kHz? I have tried running inference with inputs of up to 100 seconds (an array of 100 * 16k samples) and it produces an output. What input size was this model trained to accept? Will it perform equally well at larger sizes?

felixbur

audEERING GmbH org May 21, 2024

There is no official max length, it's defined by your RAM, but we trained with segmented audio, about 2-6 seconds.
It showed that performance doesn't drop until 3 seconds

arubittu

May 21, 2024

I want to classify longer audio clips, around 1 minute. The performance should get better, right, since I am giving the model more data to classify?

felixbur

audEERING GmbH org May 21, 2024

edited May 21, 2024

I guess the best performance would come from segmenting them and then pooling the predictions per speaker, but you could try both and compare.
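For anyone wanting to try the segment-and-pool route, here is a minimal sketch, not the maintainers' pipeline. It assumes a 1-D NumPy signal at 16 kHz and a hypothetical `predict(segment, sampling_rate)` callable that wraps the model and returns a score vector. The 4-second window and the 2-second minimum for the trailing remainder are assumptions picked to stay inside the 2-6 second training range mentioned above, and mean pooling is just one possible pooling choice.

```python
import numpy as np


def segment_signal(signal, sampling_rate=16000, window_s=4.0, min_s=2.0):
    """Split a 1-D signal into consecutive, non-overlapping windows."""
    window = int(window_s * sampling_rate)
    min_len = int(min_s * sampling_rate)
    segments = [signal[i:i + window] for i in range(0, len(signal), window)]
    # Drop a trailing remainder shorter than min_s, unless it is the only segment.
    if len(segments) > 1 and len(segments[-1]) < min_len:
        segments = segments[:-1]
    return segments


def pooled_prediction(signal, predict, sampling_rate=16000,
                      window_s=4.0, min_s=2.0):
    """Run `predict` on each segment and average the score vectors.

    `predict` is a hypothetical callable (segment, sampling_rate) -> scores;
    replace it with whatever wraps your model.
    """
    segments = segment_signal(signal, sampling_rate, window_s, min_s)
    if not segments:
        raise ValueError("signal is empty")
    scores = np.stack([np.asarray(predict(seg, sampling_rate))
                       for seg in segments])
    return scores.mean(axis=0)
```

To compare both options as suggested, call `predict` once on the full clip and once through `pooled_prediction`, then evaluate the two on your own labeled data.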

arubittu

May 21, 2024

There is no official max length, it's defined by your RAM, but we trained with segmented audio, about 2-6 seconds.
It showed that performance doesn't drop until 3 seconds

Did you use dynamic padding for batches? Is that why the range is 2 to 6 seconds?

kopyl

Nov 20, 2024

@felixbur

It showed that performance doesn't drop until 3 seconds

Meaning everything above 3 seconds is worse than 3 seconds or lower? Or am I missing something?
If so, this seems rather unexpected.

felixbur

audEERING GmbH org Nov 20, 2024

Sorry, that was badly written.
No, I meant that performance below 3 seconds is worse. From 3 seconds on it's stable.

kopyl

Nov 20, 2024

@felixbur no problem, thank you very much :)

kopyl

Nov 20, 2024

@felixbur what, in your opinion, is the optimal audio length for the best accuracy?

felixbur

audEERING GmbH org Nov 20, 2024

3 seconds and more. If you have several samples per speaker, use majority voting.
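A rough sketch of what majority voting per speaker could look like, assuming you already have one predicted label per sample. The `(speaker_id, label)` input format and the function names are made up for illustration. Ties go to the label that was seen first, which is an arbitrary choice you may want to replace.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def vote_per_speaker(predictions):
    """Aggregate per-sample labels into one label per speaker.

    `predictions` is an iterable of (speaker_id, label) pairs
    (a hypothetical format, adapt it to your data).
    """
    by_speaker = defaultdict(list)
    for speaker, label in predictions:
        by_speaker[speaker].append(label)
    return {speaker: majority_vote(labels)
            for speaker, labels in by_speaker.items()}
```

For example, `vote_per_speaker([("spk1", "a"), ("spk1", "b"), ("spk1", "a")])` gives one label for `spk1` based on its three samples.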
