Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification

Song, Yan, Cui, Ruilian, McLoughlin, Ian Vince, Dai, Li-Rong (2016) Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification. In: Odyssey 2016: The Speaker and Language Recognition Workshop. (doi:10.21437/odyssey.2016-20) (KAR id:55056)

PDF: Author's Accepted Manuscript. Language: English. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL: http://www.isca-speech.org/archive/odyssey_2016/pd...
Additional URLs: http://www.odyssey2016.org

Abstract

Recently, the i-vector representation based on deep bottleneck networks (DBN) pre-trained for automatic speech recognition has received significant interest for both speaker verification (SV) and language identification (LID). In particular, a recent unified DBN-based i-vector framework, referred to as the DBN-pGMM i-vector, has performed well.

In this paper, we replace the pGMM with a phonetic mixture of factor analyzers (pMFA) and propose a new DBN-pMFA i-vector. The DBN-pMFA i-vector includes the following improvements: (i) a pMFA model is derived from the DBN, which jointly performs feature dimension reduction and de-correlation in a single linear transformation; (ii) a shifted deep bottleneck feature (DBF), termed SDBF, is proposed to exploit temporal contextual information; (iii) a senone selection scheme is proposed to make i-vector extraction more efficient.
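The abstract does not say how the SDBF is constructed. As a rough illustration only, the sketch below assumes a scheme in the spirit of shifted delta features: each frame is concatenated with copies of later frames taken at fixed time shifts. The function name `stack_shifted`, the parameters `shift` and `blocks`, and the edge-padding choice are all hypothetical and are not taken from the paper.

```python
import numpy as np

def stack_shifted(feats: np.ndarray, shift: int = 3, blocks: int = 3) -> np.ndarray:
    """Concatenate each frame with later frames at multiples of `shift`.

    feats:  (T, D) array of per-frame features (e.g. bottleneck outputs).
    Returns a (T, D * blocks) array.
    Assumption: frames beyond the end of the utterance are filled by
    repeating the last frame; the paper may handle edges differently.
    """
    T, _ = feats.shape
    pad = shift * (blocks - 1)
    # Repeat the final frame `pad` times so every shifted slice has T rows.
    padded = np.pad(feats, ((0, pad), (0, 0)), mode="edge")
    # Slice k starts k * shift frames later than the original sequence.
    shifted = [padded[k * shift : k * shift + T] for k in range(blocks)]
    return np.concatenate(shifted, axis=1)

# Toy input: 5 frames of 2-dimensional features.
toy = np.arange(10, dtype=float).reshape(5, 2)
stacked = stack_shifted(toy, shift=1, blocks=2)
```

With `shift=1` and `blocks=2`, each output row holds the current frame followed by the next one, so the feature dimension doubles while the number of frames is unchanged.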

We evaluate the proposed DBN-pMFA i-vector on the six most confusable languages selected from NIST LRE 2009. The experimental results demonstrate that DBN-pMFA consistently outperforms the previous DBN-based framework, and that a simple senone selection scheme significantly reduces the computational complexity.
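The selection criterion is not given in the abstract. Purely as an assumed example of why selecting senones can cut cost, the sketch below keeps the senones with the largest total posterior occupancy and renormalises the remaining posteriors, so that later statistics are accumulated over fewer components. The name `select_senones` and the occupancy-based ranking are hypothetical, not the authors' scheme.

```python
import numpy as np

def select_senones(posteriors: np.ndarray, keep: int):
    """Keep the `keep` senones with the highest total occupancy.

    posteriors: (T, S) array of per-frame senone posteriors.
    Returns (kept_indices, reduced_posteriors) where reduced_posteriors
    has shape (T, keep) and each row is renormalised to sum to 1.
    Assumption: ranking by summed posterior is an illustrative criterion only.
    """
    occupancy = posteriors.sum(axis=0)                  # total mass per senone
    kept = np.sort(np.argsort(occupancy)[::-1][:keep])  # top-`keep`, in index order
    reduced = posteriors[:, kept]
    # Guard against all-zero rows before renormalising.
    reduced = reduced / np.maximum(reduced.sum(axis=1, keepdims=True), 1e-12)
    return kept, reduced

# Toy posteriors: 2 frames, 3 senones.
toy_post = np.array([[0.7, 0.1, 0.2],
                     [0.6, 0.1, 0.3]])
kept, reduced = select_senones(toy_post, keep=2)
```

Under this assumption, the cost of accumulating per-senone statistics scales with the number of senones retained rather than with the full senone inventory.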

Item Type:	Conference proceeding
DOI/Identification number:	10.21437/odyssey.2016-20
Subjects:	T Technology
Institutional Unit:	Schools > School of Computing
Former Institutional Unit:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User:	Ian McLoughlin
Date Deposited:	19 Apr 2016 11:02 UTC
Last Modified:	28 Apr 2026 08:28 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/55056 (The current URI for this page, for reference purposes)

University of Kent Author Information

McLoughlin, Ian Vince.

Creator's ORCID:	https://orcid.org/0000-0001-7111-2008