Robust sound event recognition using convolutional neural networks

Zhang, Haomin, McLoughlin, Ian Vince, Song, Yan (2016) Robust sound event recognition using convolutional neural networks. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 559-563. Institute of Electrical and Electronics Engineers, South Brisbane, QLD (doi:10.1109/ICASSP.2015.7178031) (KAR id:55020)

PDF (Robust Sound Event Recognition Using Convolutional Neural Networks): Author's Accepted Manuscript. Language: English. This work is licensed under a Creative Commons Attribution 4.0 International License.
Official URL: http://dx.doi.org/10.1109/ICASSP.2015.7178031

Abstract

Traditional sound event recognition methods based on informative front-end features such as mel-frequency cepstral coefficients (MFCCs), with back-end sequencing methods such as hidden Markov models (HMMs), tend to perform poorly in the presence of interfering acoustic noise.

Since noise corruption may be unavoidable in practical situations, it is important to develop more robust features and classifiers. Recent advances in this field use powerful machine learning techniques with high-dimensional input features such as spectrograms or auditory images. These improve robustness largely thanks to the discriminative capabilities of the back-end classifiers. We extend this further by proposing novel features derived from spectrogram energy triggering, allied with the powerful classification capabilities of a convolutional neural network (CNN). The proposed method demonstrates excellent performance under noise-corrupted conditions when compared against state-of-the-art approaches on standard evaluation tasks. To the authors' knowledge, this is the first application of CNNs in this field.

Item Type:	Conference or workshop item (Paper)
DOI/Identification number:	10.1109/ICASSP.2015.7178031
Uncontrolled keywords:	Machine hearing; auditory event detection; convolutional neural networks
Subjects:	T Technology
Divisions:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User:	Ian McLoughlin
Date Deposited:	19 Apr 2016 09:40 UTC
Last Modified:	05 Nov 2024 10:43 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/55020 (The current URI for this page, for reference purposes)

University of Kent Author Information

McLoughlin, Ian Vince.

Creator's ORCID:	https://orcid.org/0000-0001-7111-2008
