Song, Yan, Cui, Ruilian, McLoughlin, Ian Vince, Dai, Li-Rong (2016) Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification. In: Odyssey 2016: The Speaker and Language Recognition Workshop. pp. 140-145. (doi:10.21437/odyssey.2016-20) (KAR id:55056)
PDF
Author's Accepted Manuscript
Language: English
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL: http://www.isca-speech.org/archive/odyssey_2016/pd...
Abstract
Recently, the i-vector representation based on deep bottleneck networks (DBN) pre-trained for automatic speech recognition has received significant interest for both speaker verification (SV) and language identification (LID). In particular, a recent unified DBN-based i-vector framework, referred to as the DBN-pGMM i-vector, has performed well.
In this paper, we replace the pGMM with a phonetic mixture of factor analyzers (pMFA) and propose a new DBN-pMFA i-vector. The DBN-pMFA i-vector includes the following improvements: (i) a pMFA model is derived from the DBN, which jointly performs feature dimension reduction and de-correlation in a single linear transformation; (ii) a shifted deep bottleneck feature (DBF), termed SDBF, is proposed to exploit temporal contextual information; (iii) a senone selection scheme is proposed to improve the efficiency of i-vector extraction.
We evaluate the proposed DBN-pMFA i-vector on the six most confusable languages selected from NIST LRE 2009. The experimental results demonstrate that DBN-pMFA consistently outperforms the previous DBN-based framework, and that the computational complexity can be significantly reduced by applying a simple senone selection scheme.
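The abstract does not spell out how the shifted features or the senone selection are computed, so the following is only a rough sketch of what such steps might look like, not the authors' method. It assumes that temporal context is added by concatenating each bottleneck frame with copies taken at fixed frame offsets, and that senones are selected by keeping the `k` with the largest total posterior mass over an utterance. The function names `stack_shifted` and `select_senones`, the offsets, and the value of `k` are all hypothetical.

```python
import numpy as np


def stack_shifted(feats, shifts=(-2, 0, 2)):
    """Concatenate each frame with the frames found at the given offsets.

    feats: array of shape (T, D), one bottleneck feature vector per frame.
    Returns an array of shape (T, D * len(shifts)).
    Offsets that fall outside the utterance are clamped to the first or
    last frame (an assumption; the paper may handle edges differently).
    """
    num_frames = feats.shape[0]
    frame_idx = np.arange(num_frames)
    shifted = [feats[np.clip(frame_idx + s, 0, num_frames - 1)] for s in shifts]
    return np.concatenate(shifted, axis=1)


def select_senones(posteriors, k):
    """Keep the k senones with the largest total posterior mass.

    posteriors: array of shape (T, S), per-frame senone posteriors.
    Returns (keep, reduced): the sorted indices of the kept senones and
    the (T, k) posteriors renormalised so each frame sums to one.
    """
    occupancy = posteriors.sum(axis=0)            # total mass per senone
    keep = np.sort(np.argsort(occupancy)[-k:])    # top-k indices, ascending
    reduced = posteriors[:, keep]
    row_sums = np.maximum(reduced.sum(axis=1, keepdims=True), 1e-12)
    return keep, reduced / row_sums


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dbf = rng.standard_normal((5, 4))             # 5 frames, 4-dim features
    post = rng.dirichlet(np.ones(10), size=5)     # 5 frames, 10 senones
    print(stack_shifted(dbf).shape)
    keep, reduced = select_senones(post, k=3)
    print(keep, reduced.shape)
```

Under these assumptions, discarding low-occupancy senones shrinks the number of mixture components that contribute sufficient statistics, which is where a saving in i-vector extraction cost would come from.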
Item Type: | Conference or workshop item (Paper) |
---|---|
DOI/Identification number: | 10.21437/odyssey.2016-20 |
Subjects: | T Technology |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 19 Apr 2016 11:02 UTC |
Last Modified: | 05 Nov 2024 10:43 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/55056 (The current URI for this page, for reference purposes) |