Robust Sound Event Classification using Deep Neural Networks

McLoughlin, Ian Vince, Zhang, Hao-min, Xie, Zhi-Peng, Song, Yan, Xiao, Wei (2015) Robust Sound Event Classification using Deep Neural Networks. Audio, Speech, and Language Processing, IEEE/ACM Transactions on, 23 (3). pp. 540-552. ISSN 2329-9290. (doi:10.1109/TASLP.2015.2389618) (KAR id:51341)

PDF (Robust Sound Event Classification using Deep Neural Networks) Author's Accepted Manuscript
Language: English


The automatic recognition of sound events by computers is an important aspect of emerging applications such as automated surveillance, machine hearing and auditory scene understanding. Recent advances in machine learning, as well as in computational models of the human auditory system, have contributed to advances in this increasingly popular research field. Robust sound event classification, the ability to recognise sounds under real-world noisy conditions, is an especially challenging task. Classification methods translated from the speech recognition domain, using features such as mel-frequency cepstral coefficients, have been shown to perform reasonably well for the sound event classification task, although spectrogram-based or auditory image analysis techniques reportedly achieve superior performance in noise.

This paper outlines a sound event classification framework that compares auditory image front end features with spectrogram image-based front end features, using support vector machine and deep neural network classifiers. Performance is evaluated on a standard robust classification task in different levels of corrupting noise, and with several system enhancements, and shown to compare very well with current state-of-the-art classification techniques.

Item Type: Article
DOI/Identification number: 10.1109/TASLP.2015.2389618
Additional information: Full text upload complies with journal requirements
Uncontrolled keywords: Machine hearing; Auditory event detection
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK5101 Telecommunications > TK5102.9 Signal processing
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 02 Nov 2015 11:44 UTC
Last Modified: 16 Feb 2021 13:29 UTC