Random Regression Forests for Acoustic Event Detection and Classification

Phan, Huy, Maaß, Marco, Mazur, Radoslaw, Mertins, Alfred (2014) Random Regression Forests for Acoustic Event Detection and Classification. IEEE/ACM Transactions on Audio, Speech and Language Processing, 23 (1). pp. 20-31. ISSN 2329-9290. E-ISSN 2329-9304. (doi:10.1109/TASLP.2014.2367814) (KAR id:72691)

PDF Author's Accepted Manuscript
Language: English
Official URL
https://doi.org/10.1109/TASLP.2014.2367814

Abstract

Despite the success of the automatic speech recognition framework in its own application field, its adaptation to the problem of acoustic event detection has resulted in limited success. In this paper, instead of treating the problem like the segmentation and classification tasks in speech recognition, we pose it as a regression task and propose an approach based on random forest regression. Furthermore, event localization in time can be efficiently handled as a joint problem. We first decompose the training audio signals into multiple interleaved superframes, which are annotated with the corresponding event class labels and their displacements to the temporal onsets and offsets of the events. For a specific event category, a random-forest regression model is learned using the displacement information. Given an unseen superframe, the learned regressor outputs continuous estimates of the onset and offset locations of the events. To deal with multiple event categories, prior to the category-specific regression phase, a superframe-wise recognition phase is performed to reject the background superframes and to classify the event superframes into different event categories. While jointly posing event detection and localization as a regression problem is novel, the superior performance on two databases, ITC-Irst and UPC-TALP, demonstrates the efficiency and potential of the proposed approach.
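
The superframe annotation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Superframe`, `make_superframes`, and `to_boundaries`, the choice of measuring displacements from the superframe centre, and the example length and hop values are all assumptions made for illustration. The random forest itself is omitted; the sketch only shows how regression targets could be built and how predicted displacements map back to onset and offset estimates.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Superframe:
    start: int        # index of the first frame in the superframe
    end: int          # index one past the last frame
    center: float     # temporal centre, in frames (assumed reference point)
    d_onset: float    # displacement from the centre back to the event onset
    d_offset: float   # displacement from the centre forward to the event offset


def make_superframes(onset: int, offset: int, length: int, hop: int) -> List[Superframe]:
    """Slice one annotated event [onset, offset) into overlapping superframes.

    Each superframe carries its displacements to the event boundaries,
    which would serve as regression targets (hypothetical layout).
    """
    result = []
    start = onset
    while start + length <= offset:
        center = start + length / 2
        result.append(
            Superframe(start, start + length, center, center - onset, offset - center)
        )
        start += hop
    return result


def to_boundaries(center: float, d_onset: float, d_offset: float) -> Tuple[float, float]:
    """Convert (predicted) displacements back into onset/offset estimates."""
    return center - d_onset, center + d_offset


# Example with assumed values: an event spanning frames 100..160,
# superframes of 20 frames with a hop of 10 frames.
superframes = make_superframes(100, 160, length=20, hop=10)
estimates = [to_boundaries(s.center, s.d_onset, s.d_offset) for s in superframes]
```

With the ground-truth displacements, every superframe points to the same onset and offset. With a trained regressor the estimates would differ from superframe to superframe and would need to be combined in some way, for example by averaging or voting; how the paper combines them is not stated in the abstract.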

Item Type: Article
DOI/Identification number: 10.1109/TASLP.2014.2367814
Uncontrolled keywords: acoustic event detection, regression forest, random forest, superframe
Divisions: Faculties > Sciences > School of Computing
Depositing User: Huy Phan
Date Deposited: 25 Feb 2019 17:01 UTC
Last Modified: 03 Jun 2019 09:28 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/72691 (The current URI for this page, for reference purposes)
Phan, Huy: https://orcid.org/0000-0003-4096-785X