Pham, Lam Dang, McLoughlin, Ian Vince, Phan, Huy, Palaniappan, Ramaswamy (2019) A robust framework for acoustic scene classification. In: Interspeech 2019. pp. 3634-3638. (doi:10.21437/Interspeech.2019-1841) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:91418)
Official URL: https://www.isca-speech.org/archive/interspeech_20... |
Abstract
Acoustic scene classification (ASC) using front-end time-frequency features and back-end neural network classifiers has demonstrated good performance in recent years. However, a profusion of systems has arisen to suit different tasks and datasets, utilising different feature and classifier types. This paper aims at a robust framework that can explore and utilise a range of different time-frequency features and neural networks, either singly or merged, to achieve good classification performance. In particular, we exploit three different types of front-end time-frequency feature: log energy Mel filter, Gammatone filter and constant Q transform. At the back-end we evaluate a two-stage model that exploits a Convolutional Neural Network for pre-trained feature extraction, followed by Deep Neural Network classifiers as a post-trained feature adaptation model and classifier. We also explore the use of a data augmentation technique for these features that effectively generates a variety of intermediate data, reinforcing model learning abilities, particularly for marginal cases. We assess performance on the DCASE2016 dataset, demonstrating good classification accuracies exceeding 90%, significantly outperforming the DCASE2016 baseline and highly competitive compared to state-of-the-art systems.
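The abstract does not name the augmentation technique it uses. As an illustration only, one common way to generate intermediate data between training examples is a mixup-style blend of two feature maps and their labels. The sketch below assumes that approach; the function name `blend_examples`, the `alpha` parameter and the Beta-distributed weight are hypothetical choices, not details taken from the paper.

```python
import random


def blend_examples(spec_a, label_a, spec_b, label_b, alpha=0.4, rng=random):
    """Blend two (time-frequency feature, one-hot label) pairs.

    Assumption: a mixup-style linear blend stands in for the paper's
    augmentation, which the abstract does not specify.
    """
    # Blend weight in [0, 1]; drawing it from Beta(alpha, alpha) is assumed.
    lam = rng.betavariate(alpha, alpha)
    # Element-wise blend of two feature maps with identical shape.
    mixed_spec = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    # The labels are blended with the same weight, giving a soft target.
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_spec, mixed_label, lam
```

Under these assumptions, each blended pair lies on the line between two real examples, which is one way to populate the regions between classes where marginal cases fall.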
Item Type: | Conference or workshop item (Proceeding) |
---|---|
DOI/Identification number: | doi: 10.21437/Interspeech.2019-1841 |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Palaniappan Ramaswamy |
Date Deposited: | 08 Nov 2021 11:20 UTC |
Last Modified: | 04 Mar 2024 16:51 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/91418 (The current URI for this page, for reference purposes) |