Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification

Song, Yan, Cui, Ruilian, McLoughlin, Ian Vince, Dai, Li-Rong (2016) Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification. In: Odyssey 2016: The Speaker and Language Recognition Workshop. pp. 140-145. (KAR id:55056)

PDF Author's Accepted Manuscript
Language: English


Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Official URL
http://www.isca-speech.org/archive/odyssey_2016/pd...

Abstract

Recently, the i-vector representation based on deep bottleneck networks (DBN) pre-trained for automatic speech recognition has received significant interest for both speaker verification (SV) and language identification (LID). In particular, a recent unified DBN based i-vector framework, referred to as DBN-pGMM i-vector, has performed well.

We evaluate the proposed DBN-pMFA i-vector on the six most confused languages selected from NIST LRE 2009. The experimental results demonstrate that DBN-pMFA consistently outperforms the previous DBN based framework. Computational complexity can also be significantly reduced by applying a simple senone selection scheme.

Item Type: Conference or workshop item (Paper)
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 19 Apr 2016 11:02 UTC
Last Modified: 29 May 2019 17:14 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/55056 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008