Kent Academic Repository

Improved language identification using deep bottleneck network

Song, Yan, Cui, Ruilian, Hong, Xinhai, McLoughlin, Ian Vince, Shi, Jiong, Dai, Lirong (2016) Improved language identification using deep bottleneck network. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 4200-4204. IEEE, South Brisbane, QLD (doi:10.1109/ICASSP.2015.7178762) (KAR id:55021)

Abstract

Effective representation plays an important role in automatic spoken language identification (LID). Recently, several representations that employ a pre-trained deep neural network (DNN) as the front-end feature extractor have achieved state-of-the-art performance. However, performance is still far from satisfactory for dialect and short-duration utterance identification tasks, owing to the deficiencies of existing representations. To address this issue, this paper proposes improved representations that exploit the information extracted from different layers of the DNN structure. This is conceptually motivated by regarding the DNN as a bridge between low-level acoustic input and high-level phonetic output features. Specifically, we employ a deep bottleneck network (DBN), a DNN with an internal bottleneck layer acting as a feature extractor. We extract representations from two layers of this single network, i.e. DBN-TopLayer and DBN-MidLayer. Evaluations on the NIST LRE2009 dataset, as well as the more specific dialect recognition task, show that each representation can achieve an incremental performance gain. Furthermore, a simple fusion of the representations is shown to exceed current state-of-the-art performance.
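The idea of reading two representations from a single bottleneck network can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the random (untrained) weights, the sigmoid and softmax activations, and the function names are all hypothetical, and the sketch assumes that the "mid" representation is the bottleneck layer's activation and the "top" representation is the final layer's output.

```python
import math
import random

# Hypothetical layer sizes: acoustic input, hidden, bottleneck, hidden, output.
# None of these values come from the paper.
LAYER_SIZES = [39, 64, 8, 64, 20]
BOTTLENECK_LAYER = 1  # index of the narrow layer among the weight layers


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases; a real system would load pre-trained parameters."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(sizes, seed=0):
    """Build one weight layer per consecutive pair of sizes."""
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def affine(layer, x):
    """Compute W x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def softmax(z):
    peak = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(network, frame):
    """Return the activations of every layer for one input frame."""
    activations = []
    x = frame
    for i, layer in enumerate(network):
        z = affine(layer, x)
        x = softmax(z) if i == len(network) - 1 else sigmoid(z)
        activations.append(x)
    return activations


def extract_representations(network, frame, bottleneck_layer=BOTTLENECK_LAYER):
    """Read a mid-level (bottleneck) and a top-level (output) vector from one pass."""
    activations = forward(network, frame)
    return {"mid": activations[bottleneck_layer], "top": activations[-1]}


network = build_network(LAYER_SIZES)
frame = [0.0] * LAYER_SIZES[0]  # placeholder input frame
reps = extract_representations(network, frame)
print(len(reps["mid"]), len(reps["top"]))
```

With these hypothetical sizes, a single forward pass yields an 8-dimensional bottleneck vector and a 20-dimensional output vector, so both representations come from the same network at no extra cost.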

Item Type: Conference or workshop item (Paper)
DOI/Identification number: 10.1109/ICASSP.2015.7178762
Subjects: T Technology
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 19 Apr 2016 09:43 UTC
Last Modified: 05 Nov 2024 10:43 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/55021 (The current URI for this page, for reference purposes)

University of Kent Author Information

McLoughlin, Ian Vince.

Creator's ORCID: https://orcid.org/0000-0001-7111-2008