Listening and grouping: an online autoregressive approach for monaural speech separation

Li, Zheng-xi, Song, Yan, Dai, Li-Rong, McLoughlin, Ian Vince (2019) Listening and grouping: an online autoregressive approach for monaural speech separation. IEEE Transactions on Audio, Speech, and Language Processing. ISSN 1558-7916. E-ISSN 2329-9304. (doi:10.1109/TASLP.2019.2892241) (Access to this publication is currently restricted. You may be able to access a copy if URLs are provided)

PDF - Author's Accepted Manuscript
Restricted to Repository staff only until 10 January 2020.
Official URL
http://dx.doi.org/10.1109/TASLP.2019.2892241

Abstract

This paper proposes an autoregressive approach to harness the power of deep learning for multi-speaker monaural speech separation. It exploits a causal temporal context in both the mixture and the past estimated separated signals, and performs online separation that is compatible with real-time applications. The approach adopts a learned listening and grouping architecture motivated by computational auditory scene analysis, with a grouping stage that effectively addresses the label permutation problem at both frame and segment levels. Experimental results on the benchmark WSJ0-2mix dataset show that the new approach can outperform the majority of state-of-the-art methods in both closed-set and open-set conditions in terms of signal-to-distortion ratio (SDR) improvement and perceptual evaluation of speech quality (PESQ), including approaches that exploit whole-utterance statistics for separation, while using relatively fewer model parameters.
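
The label permutation problem mentioned in the abstract can be illustrated with a small sketch. A separator emits its output streams in an arbitrary order, so before an error can be measured each estimate has to be matched to a reference speaker. The Python below is only a toy illustration under stated assumptions: it uses an exhaustive search over permutations with a mean-squared-error criterion, which is a common generic technique and not the grouping stage proposed in the paper. The function names `mse` and `best_permutation` and the toy signals are hypothetical.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_permutation(estimates, references):
    """Exhaustively search for the estimate-to-reference assignment
    with the lowest average MSE.

    Returns (perm, error), where perm[i] is the index of the reference
    assigned to estimates[i]. This is an illustrative assumption, not
    the method described in the paper.
    """
    best_perm, best_err = None, float("inf")
    for perm in permutations(range(len(references))):
        err = sum(
            mse(estimates[i], references[p]) for i, p in enumerate(perm)
        ) / len(estimates)
        if err < best_err:
            best_perm, best_err = perm, err
    return best_perm, best_err


# Toy two-speaker case: the estimates come out in swapped order.
references = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
estimates = [[-0.9, -1.1, -1.0, -1.0], [1.0, 0.9, 1.1, 1.0]]

perm, err = best_permutation(estimates, references)
print(perm, err)
```

In this toy case the search assigns the first estimate to the second reference and vice versa. The exhaustive search grows factorially with the number of speakers, which is acceptable for the two-speaker setting of WSJ0-2mix but is one reason more structured assignment schemes are of interest.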

Item Type: Article
DOI/Identification number: 10.1109/TASLP.2019.2892241
Uncontrolled keywords: Speech separation, deep learning, label permutation problem, computational auditory scene analysis
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing > Data Science
Depositing User: Ian McLoughlin
Date Deposited: 31 Dec 2018 03:36 UTC
Last Modified: 30 May 2019 08:41 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/71467 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008