arxiv:1807.03418

Interpreting and Explaining Deep Neural Networks for Classification of Audio Signals

Published on Jul 9, 2018

Authors:

Abstract

Interpretability of deep neural networks is a recently emerging area of machine learning research targeting a better understanding of how models perform feature selection and derive their classification decisions. This paper explores the interpretability of neural networks in the audio domain by using the previously proposed technique of layer-wise <PRE_TAG>relevance propagation</POST_TAG> (LRP). We present a novel audio dataset of English spoken digits which we use for classification tasks on spoken digits and speaker's gender. We use LRP to identify relevant features for two neural network architectures that process either waveform or spectrogram representations of the data. Based on the relevance scores obtained from LRP, hypotheses about the neural networks' feature selection are derived and subsequently tested through systematic manipulations of the input data. The results confirm that the networks are highly reliant on features marked as relevant by LRP.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/1807.03418 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/1807.03418 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.