McLoughlin, Ian Vince, Zhang, Hao-min, Xie, Zhi-Peng, Song, Yan, Xiao, Wei (2015) Robust Sound Event Classification using Deep Neural Networks. Audio, Speech, and Language Processing, IEEE/ACM Transactions on, 23 (3). pp. 540-552. ISSN 2329-9290. (doi:10.1109/TASLP.2015.2389618) (KAR id:51341)
PDF (Robust Sound Event Classification using Deep Neural Networks)
Author's Accepted Manuscript
Language: English
Official URL: http://www.dx.doi.org/10.1109/TASLP.2015.2389618
Abstract
The automatic recognition of sound events by computers is an important aspect of emerging applications such as automated surveillance, machine hearing and auditory scene understanding. Recent advances in machine learning, as well as in computational models of the human auditory system, have contributed to advances in this increasingly popular research field. Robust sound event classification, the ability to recognise sounds under real-world noisy conditions, is an especially challenging task. Classification methods translated from the speech recognition domain, using features such as mel-frequency cepstral coefficients, have been shown to perform reasonably well for the sound event classification task, although spectrogram-based or auditory image analysis techniques reportedly achieve superior performance in noise.
This paper outlines a sound event classification framework that compares auditory image front end features with spectrogram image-based front end features, using support vector machine and deep neural network classifiers. Performance is evaluated on a standard robust classification task in different levels of corrupting noise, and with several system enhancements, and shown to compare very well with current state-of-the-art classification techniques.
Item Type: | Article |
---|---|
DOI/Identification number: | 10.1109/TASLP.2015.2389618 |
Additional information: | Full text upload complies with journal requirements |
Uncontrolled keywords: | Machine hearing; Auditory event detection |
Subjects: | T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK5101 Telecommunications > TK5102.9 Signal processing |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 02 Nov 2015 11:44 UTC |
Last Modified: | 05 Nov 2024 10:37 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/51341 (The current URI for this page, for reference purposes) |