Learning Compact Structural Representations For Audio Events Using Regressor Banks

Phan, Huy, Maass, Marco, Hertel, Lars, Mazur, Radoslaw, McLoughlin, Ian Vince, Merins, Alfred (2016) Learning Compact Structural Representations For Audio Events Using Regressor Banks. In: 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings. IEEE ISBN 978-1-4799-9988-0. (doi:10.1109/ICASSP.2016.7471667) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:55054)

Official URL: https://doi.org/10.1109/ICASSP.2016.7471667

Abstract

We introduce a new learned descriptor for audio signals which is efficient for event representation. The entries of the descriptor are produced by evaluating a set of regressors on the input signal. The regressors are class-specific and trained using the random regression forests framework. Given an input signal, each regressor estimates the onset and offset positions of the target event. The estimation confidence scores output by a regressor are then used to quantify how the target event aligns with the temporal structure of the corresponding category. Our proposed descriptor has two advantages. First, it is compact, i.e. the dimensionality of the descriptor is equal to the number of event classes. Second, we show that even simple linear classification models, trained on our descriptor, yield better accuracies on audio event classification task than not only the nonlinear baselines but also the state-of-the-art results.
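
As a rough illustration of the "one descriptor entry per event class" idea in the abstract, the following Python sketch builds a descriptor by evaluating a bank of class-specific scorers on one input signal. It is not the authors' implementation: the scorers are hypothetical template matchers standing in for the trained random regression forests, and the names `make_template_regressor`, `describe`, and the class labels and templates are assumptions made up for this example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a trained class-specific regressor:
# it maps a signal (a sequence of frame values) to a confidence score.
Regressor = Callable[[Sequence[float]], float]


def make_template_regressor(template: Sequence[float]) -> Regressor:
    """Build a toy scorer (NOT a random regression forest).

    The confidence is 1 / (1 + mean squared error) between the signal and
    the class template, so it lies in (0, 1] and equals 1 on a perfect match.
    """
    def score(signal: Sequence[float]) -> float:
        n = min(len(signal), len(template))
        if n == 0:
            return 0.0
        mse = sum((signal[i] - template[i]) ** 2 for i in range(n)) / n
        return 1.0 / (1.0 + mse)

    return score


def describe(signal: Sequence[float], bank: Dict[str, Regressor]) -> List[float]:
    """Evaluate every regressor in the bank; one descriptor entry per class."""
    # Sort class names so the entry order is stable across calls.
    return [bank[name](signal) for name in sorted(bank)]


# Assumed toy classes and templates, for illustration only.
bank = {
    "door_knock": make_template_regressor([0.0, 1.0, 0.0, 1.0]),
    "speech": make_template_regressor([0.5, 0.5, 0.5, 0.5]),
    "phone_ring": make_template_regressor([1.0, 1.0, 0.0, 0.0]),
}

signal = [0.0, 1.0, 0.0, 1.0]
descriptor = describe(signal, bank)

# The descriptor length equals the number of event classes in the bank.
print(len(descriptor), descriptor)
```

With three classes in the bank the descriptor has three entries, whatever the length of the input signal. A linear classifier could then be trained on such vectors, which is the setting the abstract describes.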

Item Type:	Conference proceeding
DOI/Identification number:	10.1109/ICASSP.2016.7471667
Uncontrolled keywords:	feature learning; audio event; recognition; structural encoding
Subjects:	T Technology
Institutional Unit:	Schools > School of Computing
Former Institutional Unit:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User:	Ian McLoughlin
Date Deposited:	19 Apr 2016 10:52 UTC
Last Modified:	28 Apr 2026 08:28 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/55054 (The current URI for this page, for reference purposes)

University of Kent Author Information

Phan, Huy.

Creator's ORCID:	https://orcid.org/0000-0003-4096-785X

McLoughlin, Ian Vince.

Creator's ORCID:	https://orcid.org/0000-0001-7111-2008
