Pham, Lam Dang, McLoughlin, Ian Vince, Palaniappan, Ramaswamy, Lang, Yue (2019) Bag-of-features models based on C-DNN network for acoustic scene classification. In: 2019 AES International Conference on Audio Forensics (June 2019). (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:91420)
| Official URL: https://www.aes.org/e-lib/online/browse.cfm?elib=2... |
Abstract
This work proposes bag-of-features deep learning models for acoustic scene classification (ASC) – identifying recording locations by analyzing background sound. We explore the effect on classification accuracy of various front-end feature extraction techniques, ensembles of audio channels, and patch sizes from three kinds of spectrogram. The back-end process presents a two-stage learning model with a pre-trained CNN (preCNN) and a post-trained DNN (postDNN). Additionally, data augmentation using the mixup technique is investigated for both the pre-trained and post-trained processes, to improve classification accuracy through increasing class boundary training conditions. Our experiments on the 2018 Challenge on Detection and Classification of Acoustic Scenes and Events - Acoustic Scene Classification (DCASE2018-ASC) subtask 1A and 1B significantly outperform the DCASE2018 reference implementation and approach state-of-the-art performance for each task. Results reveal that the ensemble of multi-spectrogram features and data augmentation is beneficial to performance.
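The abstract names mixup but does not give its formulation. As a rough illustration only, the sketch below follows the commonly used definition of mixup, in which two training examples and their one-hot labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. The function name `mixup_pair`, the value `alpha=0.4`, and the toy patch and label values are assumptions made for this example and are not taken from the paper.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two examples and their one-hot labels (generic mixup sketch).

    alpha=0.4 is an assumed value, not one reported in the abstract.
    """
    rng = rng or random.Random()
    # Mixing weight lam in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs (e.g. flattened spectrogram patches).
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    # The same combination applied to the labels gives a soft target.
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy data: two hypothetical 4-value patches from different scene classes.
rng = random.Random(0)
patch_a, label_a = [0.2, 0.8, 0.5, 0.1], [1.0, 0.0, 0.0]
patch_b, label_b = [0.9, 0.1, 0.4, 0.7], [0.0, 1.0, 0.0]

mixed_x, mixed_y, lam = mixup_pair(patch_a, label_a, patch_b, label_b, rng=rng)
print("lambda:", lam)
print("mixed patch:", mixed_x)
print("soft label:", mixed_y)
```

Because the mixed label is a soft target lying between two classes, training on such pairs exposes the model to points near class boundaries, which matches the motivation stated in the abstract.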
| Item Type: | Conference or workshop item (Proceeding) |
|---|---|
| Institutional Unit: | Schools > School of Computing |
| Former Institutional Unit: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
| Depositing User: | Palaniappan Ramaswamy |
| Date Deposited: | 08 Nov 2021 11:24 UTC |
| Last Modified: | 20 May 2025 10:26 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/91420 (The current URI for this page, for reference purposes) |
https://orcid.org/0000-0001-7111-2008