Zhang, Hao-min and McLoughlin, Ian Vince and Song, Yan (2016) Robust Sound Event Detection in Continuous Audio Environments. In: 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016): Understanding Speech Processing in Humans and Machines. Curran Associates, Red Hook, New York, USA. ISBN 978-1-5108-3313-5. (doi:10.21437/Interspeech.2016-392) (KAR id:56309)
PDF
Author's Accepted Manuscript
Language: English
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL: http://dx.doi.org/10.21437/Interspeech.2016-392
Abstract
Sound event detection in real-world environments has attracted significant research interest recently because of its applications in popular fields such as machine hearing and automated surveillance, as well as in sound scene understanding. This paper considers continuous robust sound event detection, that is, the detection of multiple overlapping sound events in different types of interfering noise. First, a standard evaluation task is outlined, based upon existing test data sets for the classification of isolated sound events. This paper then proposes and evaluates the use of spectrogram image features with an energy detector to segment sound events, before developing a novel segmentation method that makes use of a Bayesian inference criterion. At the back end, a convolutional neural network is used to classify the detected regions, and this combination is compared to several alternative approaches. The proposed method is shown to be capable of achieving very good performance compared with current state-of-the-art techniques.
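The abstract does not describe how the energy detector works, so the following is only a minimal sketch of a generic energy-threshold segmenter, not the paper's method. The function names (`frame_energies`, `detect_segments`), the frame length, the hop size, and the fixed threshold are all assumptions chosen for illustration. In the pipeline the abstract outlines, regions like those returned here would then be passed to a classifier.

```python
def frame_energies(signal, frame_len, hop):
    """Return the short-time energy (mean squared amplitude) of each frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_segments(energies, threshold, min_frames=1):
    """Return (start_frame, end_frame) pairs, end exclusive, for runs of
    frames whose energy exceeds the threshold (hypothetical fixed value)."""
    segments = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i  # a candidate event begins
        elif energy <= threshold and start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None  # the candidate event has ended
    # Close a segment that runs to the end of the recording.
    if start is not None and len(energies) - start >= min_frames:
        segments.append((start, len(energies)))
    return segments


# Toy input: silence, a constant-amplitude burst, then silence again.
toy_signal = [0.0] * 100 + [1.0] * 100 + [0.0] * 100
toy_energies = frame_energies(toy_signal, frame_len=10, hop=10)
toy_segments = detect_segments(toy_energies, threshold=0.5)
print(toy_segments)
```

A fixed threshold like this one tends to fail when the background noise level changes, which is one reason to use a statistical segmentation criterion instead.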
Item Type: | Book section |
---|---|
DOI/Identification number: | 10.21437/Interspeech.2016-392 |
Uncontrolled keywords: | sound event detection, convolutional neural network, Bayesian inference, segmentation |
Subjects: | T Technology |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 15 Jul 2016 08:24 UTC |
Last Modified: | 05 Nov 2024 10:46 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/56309 (The current URI for this page, for reference purposes) |