
Improved language identification using deep bottleneck network

Song, Yan, Cui, Ruilian, Hong, Xinhai, McLoughlin, Ian Vince, Shi, Jiong, Dai, Lirong (2016) Improved language identification using deep bottleneck network. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 4200-4204. IEEE, South Brisbane, QLD (doi:10.1109/ICASSP.2015.7178762) (KAR id:55021)

PDF (Author's Accepted Manuscript)
Language: English


Creative Commons Licence
This work is licensed under a Creative Commons Attribution 4.0 International License.
Official URL
http://dx.doi.org/10.1109/ICASSP.2015.7178762

Abstract

Effective representation plays an important role in automatic spoken language identification (LID). Recently, several representations that employ a pre-trained deep neural network (DNN) as the front-end feature extractor have achieved state-of-the-art performance. However, the performance is still far from satisfactory for dialect and short-duration utterance identification tasks, due to the deficiency of existing representations. To address this issue, this paper proposes improved representations that exploit the information extracted from different layers of the DNN structure. This is conceptually motivated by regarding the DNN as a bridge between low-level acoustic input and high-level phonetic output features. Specifically, we employ a deep bottleneck network (DBN), a DNN with an internal bottleneck layer acting as a feature extractor. We extract representations from two layers of this single network, i.e. DBN-TopLayer and DBN-MidLayer. Evaluations on the NIST LRE2009 dataset, as well as the more specific dialect recognition task, show that each representation can achieve an incremental performance gain. Furthermore, a simple fusion of the representations is shown to exceed current state-of-the-art performance.
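
The idea of reading features from two layers of one bottleneck network can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the sigmoid hidden activations, the random untrained weights, and the names `init_network` and `forward_two_taps` are all assumptions made for illustration. It only shows how a single forward pass can yield both a narrow mid-layer (bottleneck) output and a top-layer (softmax) output for each frame.

```python
import numpy as np

# Hypothetical layer sizes: 39-dim acoustic input, a narrow 8-unit
# bottleneck in the middle, and 20 output classes. None of these
# values come from the paper.
LAYER_SIZES = [39, 64, 8, 64, 20]
BOTTLENECK_INDEX = 1  # index of the weight layer whose output is the bottleneck


def init_network(layer_sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = np.random.default_rng(seed)
    return [
        (rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward_two_taps(frames, params, bottleneck_index):
    """Run one forward pass and return (mid-layer, top-layer) outputs.

    Hidden layers use a sigmoid (an assumption); the last layer uses a
    softmax so that each row of the top-layer output sums to 1.
    """
    h = frames
    mid = None
    last = len(params) - 1
    for i, (weights, bias) in enumerate(params):
        z = h @ weights + bias
        if i == last:
            z = z - z.max(axis=1, keepdims=True)  # numerical stability
            e = np.exp(z)
            h = e / e.sum(axis=1, keepdims=True)
        else:
            h = 1.0 / (1.0 + np.exp(-z))
        if i == bottleneck_index:
            mid = h  # tap the bottleneck activations
    return mid, h


params = init_network(LAYER_SIZES)
frames = np.random.default_rng(1).standard_normal((5, LAYER_SIZES[0]))
mid_feats, top_feats = forward_two_taps(frames, params, BOTTLENECK_INDEX)
print(mid_feats.shape, top_feats.shape)
```

In a real system the network would be trained first, and each of the two per-frame outputs would feed a separate back-end language classifier. Those stages are outside the scope of this sketch.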

Item Type: Conference or workshop item (Paper)
DOI/Identification number: 10.1109/ICASSP.2015.7178762
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 19 Apr 2016 09:43 UTC
Last Modified: 29 May 2019 17:14 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/55021 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008