
Time-Frequency Feature Fusion for Noise-Robust Audio Event Classification

McLoughlin, Ian, Xie, Zhi-Peng, Song, Yan, Phan, Huy, Ramaswamy, Palaniappan (2020) Time-Frequency Feature Fusion for Noise-Robust Audio Event Classification. Circuits, Systems, and Signal Processing, 39. pp. 1672-1687. ISSN 0278-081X. E-ISSN 1531-5878. (doi:10.1007/s00034-019-01203-0) (KAR id:75276)

Language: English


Official URL:
https://doi.org/10.1007/s00034-019-01203-0

Abstract

This paper explores the use of three different two-dimensional time-frequency features for audio event classification with deep neural network back-end classifiers. The evaluations use spectrogram, cochleogram and constant-Q transform based images for classification of 50 classes of audio events in varying levels of acoustic background noise, revealing interesting performance patterns with respect to noise level, feature image type and classifier. Evidence is obtained that two well-performing features, the spectrogram and cochleogram, make use of information that is potentially complementary in the input features. Feature fusion is thus explored for each pair of features, as well as for all tested features. Results indicate that a fusion of spectrogram and cochleogram information is particularly beneficial, yielding an impressive 50-class accuracy of over 96% at 0 dB SNR, and exceeding 99% accuracy at 10 dB SNR and above. Meanwhile, the cochleogram image feature is found to perform well in extreme noise cases of -5 dB and -10 dB SNR.

Item Type: Article
DOI/Identification number: 10.1007/s00034-019-01203-0
Uncontrolled keywords: Audio event classification, deep neural network, convolutional neural network, time-frequency image features
Subjects: T Technology
Divisions: Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 09 Jul 2019 21:00 UTC
Last Modified: 08 Dec 2022 21:48 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/75276
McLoughlin, Ian: https://orcid.org/0000-0001-7111-2008
Phan, Huy: https://orcid.org/0000-0003-4096-785X
Ramaswamy, Palaniappan: https://orcid.org/0000-0001-5296-8396