Kent Academic Repository

Enabling Early Audio Event Detection With Neural Networks

Phan, Huy, Koch, Philipp, McLoughlin, Ian Vince, Mertins, A. (2018) Enabling Early Audio Event Detection With Neural Networks. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings. pp. 141-145. IEEE. ISBN 978-1-5386-4659-5. E-ISBN 978-1-5386-4658-8. (doi:10.1109/ICASSP.2018.8461859) (KAR id:67162)


This paper presents a methodology for early detection of audio events from audio streams. Early detection is the ability to infer an ongoing event during its initial stage. The proposed system consists of a novel inference step coupled with dual parallel tailored-loss deep neural networks (DNNs). The DNNs share a similar architecture except for their loss functions, i.e. a weighted loss and a multitask loss, which are designed to cope efficiently with issues common to audio event detection. The inference step is newly introduced to make use of the network output for recognizing ongoing events. The monotonicity of the detection function, which is required for reliable early detection, is also proved. Experiments on the ITC-Irst database show that the proposed system achieves state-of-the-art detection performance. Furthermore, even partial events are sufficient to achieve performance similar to that obtained when an entire event is observed, which enables early event detection.

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.1109/ICASSP.2018.8461859
Subjects: T Technology
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 30 May 2018 12:05 UTC
Last Modified: 08 Dec 2022 21:20 UTC
