Phan, Huy, Maass, Marco, Hertel, Lars, Mazur, Radoslaw, McLoughlin, Ian Vince, Merins, Alfred (2016) Learning Compact Structural Representations For Audio Events Using Regressor Banks. In: 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings. IEEE. ISBN 978-1-4799-9988-0. (doi:10.1109/ICASSP.2016.7471667) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:55054)
Official URL: https://doi.org/10.1109/ICASSP.2016.7471667
Abstract
We introduce a new learned descriptor for audio signals which is
efficient for event representation. The entries of the descriptor are
produced by evaluating a set of regressors on the input signal. The
regressors are class-specific and trained using the random regression
forests framework. Given an input signal, each regressor estimates
the onset and offset positions of the target event. The estimation
confidence scores output by a regressor are then used to quantify how
the target event aligns with the temporal structure of the corresponding
category. Our proposed descriptor has two advantages. First, it
is compact, i.e. the dimensionality of the descriptor is equal to the
number of event classes. Second, we show that even simple linear
classification models trained on our descriptor yield better accuracy
on the audio event classification task than both the nonlinear
baselines and the state-of-the-art results.
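
The pipeline described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `make_toy_regressor` stand-in is a simple threshold-and-duration heuristic, not a random regression forest, and the class names, confidence formula, and classifier weights are hypothetical rather than taken from the paper. The sketch only shows the shape of the idea: one class-specific regressor per event class, one confidence score per regressor, and a linear model on the resulting descriptor.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: a class-specific regressor maps a signal to
# (onset estimate, offset estimate, confidence score).
Estimate = Tuple[float, float, float]
Regressor = Callable[[Sequence[float]], Estimate]


def make_toy_regressor(expected_duration: int, threshold: float = 0.5) -> Regressor:
    """Stand-in for a trained class-specific regressor (NOT a regression forest).

    Onset and offset are the first and last samples above `threshold`; the
    confidence is higher when the detected duration is close to the duration
    this class is assumed to have.
    """
    def regress(signal: Sequence[float]) -> Estimate:
        active = [i for i, x in enumerate(signal) if x > threshold]
        if not active:
            return 0.0, 0.0, 0.0
        onset, offset = active[0], active[-1]
        duration = offset - onset + 1
        confidence = 1.0 / (1.0 + abs(duration - expected_duration))
        return float(onset), float(offset), confidence
    return regress


def build_descriptor(signal: Sequence[float], bank: Dict[str, Regressor]) -> List[float]:
    """Collect one confidence score per class, in sorted class-name order."""
    return [bank[name](signal)[2] for name in sorted(bank)]


def linear_classify(descriptor: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    biases: Sequence[float],
                    class_names: Sequence[str]) -> str:
    """Pick the class with the highest linear score w . d + b."""
    scores = [sum(w * d for w, d in zip(row, descriptor)) + b
              for row, b in zip(weights, biases)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best]


# Hypothetical two-class bank: a short event and a long event.
bank = {"door_knock": make_toy_regressor(2), "speech": make_toy_regressor(6)}
class_names = sorted(bank)

# Toy "signal": a burst that is active for two samples.
signal = [0.0, 0.9, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0]
descriptor = build_descriptor(signal, bank)

# Hand-set identity weights stand in for a trained linear classifier.
predicted = linear_classify(descriptor, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], class_names)
print(descriptor, predicted)
```

The descriptor length equals the number of regressors in the bank, which mirrors the compactness property claimed in the abstract; in a real system the weights would be learned from labeled descriptors rather than set by hand.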
Item Type: | Conference or workshop item (Proceeding) |
---|---|
DOI/Identification number: | 10.1109/ICASSP.2016.7471667 |
Uncontrolled keywords: | feature learning; audio event; recognition; structural encoding |
Subjects: | T Technology |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 19 Apr 2016 10:52 UTC |
Last Modified: | 05 Nov 2024 10:43 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/55054 (The current URI for this page, for reference purposes) |