Jin, Ma, Song, Yan, McLoughlin, Ian Vince (2017) End-to-end DNN-CNN Classification for Language Identification. In: Proceedings of The World Congress on Engineering 2017. 1. pp. 119-203. IAENG ISBN 978-988-14-0474-9. (KAR id:61426)
PDF
Author's Accepted Manuscript
Language: English
Official URL: http://www.iaeng.org/publication/WCE2017/
Abstract
A defining problem in spoken language identification (LID) is how to design effective representations from which language-specific features can be extracted.
This paper proposes and explores a novel network that models an effective representation using first- and second-order statistics of features extracted from a well-trained, phoneme-related DNN bottleneck network followed by a stack of convolutional (CNN) layers.
Evaluation on NIST LRE 2009 shows improved performance compared to current state-of-the-art systems, achieving over 33% and 20% relative equal error rate (EER) improvement for 3 s and 10 s utterances, respectively.
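The abstract refers to first- and second-order statistics of frame-level features. As a rough illustration only, the sketch below pools a variable-length sequence of feature frames into one fixed-length vector by concatenating the per-dimension mean (first order) and standard deviation (second order). This is not the authors' implementation: the function name `stats_pooling`, the use of the population standard deviation as the second-order statistic, and the toy input values are all assumptions made for this example, and the paper may define its statistics differently.

```python
import math
from typing import List, Sequence


def stats_pooling(frames: Sequence[Sequence[float]]) -> List[float]:
    """Pool a (num_frames x dim) feature sequence into one fixed-length vector.

    Hypothetical helper: returns the per-dimension mean followed by the
    per-dimension population standard deviation, so the output has
    length 2 * dim regardless of how many frames the utterance has.
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same dimensionality")

    # First-order statistic: mean of each feature dimension over time.
    means = [sum(f[d] for f in frames) / num_frames for d in range(dim)]

    # Second-order statistic (assumed form): population standard deviation.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / num_frames)
        for d in range(dim)
    ]
    return means + stds


# Toy input: 3 frames of 2-dimensional features (made-up values).
toy_frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]
pooled = stats_pooling(toy_frames)
```

Pooling of this kind gives utterances of different durations (for example 3 s and 10 s) a representation of the same size, which is one common reason to summarise frame-level features with statistics before classification.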
| Item Type: | Conference or workshop item (Proceeding) |
|---|---|
| Subjects: | T Technology |
| Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
| Depositing User: | Ian McLoughlin |
| Date Deposited: | 21 Apr 2017 09:28 UTC |
| Last Modified: | 16 Feb 2021 13:44 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/61426 (The current URI for this page, for reference purposes) |