Song, Yan, Cui, Ruilian, Hong, Xinhai, McLoughlin, Ian Vince, Shi, Jiong, Dai, Lirong (2016) Improved language identification using deep bottleneck network. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). . pp. 4200-4204. IEEE, South Brisbane, QLD (doi:10.1109/ICASSP.2015.7178762) (KAR id:55021)
PDF (Authors accepted version)
Author's Accepted Manuscript
Language: English
This work is licensed under a Creative Commons Attribution 4.0 International License.
|
|
Download this file (PDF/295kB) |
|
Request a format suitable for use with assistive technology e.g. a screenreader | |
Official URL: http://dx.doi.org/10.1109/ICASSP.2015.7178762 |
Abstract
Effective representation plays an important role in automatic spoken language identification (LID). Recently, several representations that employ a pre-trained deep neural network (DNN) as the front-end feature extractor, have achieved state-of-the-art performance. However the performance is still far from satisfactory for dialect and short-duration utterance identification tasks, due to the deficiency of existing representations. To address this issue, this paper proposes the improved representations to exploit the information extracted from different layers of the DNN structure. This is conceptually motivated by regarding the DNN as a bridge between low-level acoustic input and high-level phonetic output features. Specifically, we employ deep bottleneck network (DBN), a DNN with an internal bottleneck layer acting as a feature extractor. We extract representations from two layers of this single network, i.e. DBN-TopLayer and DBN-MidLayer. Evaluations on the NIST LRE2009 dataset, as well as the more specific dialect recognition task, show that each representation can achieve an incremental performance gain. Furthermore, a simple fusion of the representations is shown to exceed current state-of-the-art performance.
Item Type: | Conference or workshop item (Paper) |
---|---|
DOI/Identification number: | 10.1109/ICASSP.2015.7178762 |
Subjects: | T Technology |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 19 Apr 2016 09:43 UTC |
Last Modified: | 05 Nov 2024 10:43 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/55021 (The current URI for this page, for reference purposes) |
- Link to SensusAccess
- Export to:
- RefWorks
- EPrints3 XML
- BibTeX
- CSV
- Depositors only (login required):