
End-to-end DNN-CNN Classification for Language Identification

Jin, Ma; Song, Yan; McLoughlin, Ian Vince (2017) End-to-end DNN-CNN Classification for Language Identification. In: Proceedings of The World Congress on Engineering 2017. 1. pp. 119-203. IAENG ISBN 978-988-14-0474-9. (KAR id:61426)

PDF Author's Accepted Manuscript
Language: English
Official URL
http://www.iaeng.org/publication/WCE2017/

Abstract

A defining problem in spoken language identification (LID) is how to design effective representations from which language-specific features can be extracted.

This paper proposes and explores a novel network that models an effective representation using first- and second-order statistics of features extracted from a well-trained, phoneme-related DNN bottleneck network followed by a stack of convolutional (CNN) layers.
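
As a rough illustration of what pooling first- and second-order statistics can look like, the sketch below collapses a variable-length sequence of frame-level feature vectors into one fixed-length vector. It assumes the statistics are the per-dimension mean and standard deviation taken over time; the abstract does not specify the exact form, and the function name `stats_pool` is hypothetical rather than taken from the paper.

```python
from math import sqrt


def stats_pool(frames):
    """Pool frame-level features into one fixed-length utterance vector.

    frames: list of equal-length lists of floats, one list per time frame.
    Returns the per-dimension means followed by the per-dimension
    standard deviations (an assumed choice of first- and second-order
    statistics, not necessarily the one used in the paper).
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    # First-order statistic: mean of each feature dimension over time.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Second-order statistic: population variance (divide by n), then its root.
    variances = [
        sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)
    ]
    stds = [sqrt(v) for v in variances]
    return means + stds


# Two frames of 2-dimensional features yield one 4-dimensional vector.
pooled = stats_pool([[1.0, 2.0], [3.0, 2.0]])
```

The output length depends only on the feature dimension, not on the number of frames, which is why this kind of pooling is a convenient way to handle utterances of different durations such as 3 s and 10 s.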

Evaluation on NIST LRE 2009 shows improved performance compared to current state-of-the-art systems, achieving relative equal error rate (EER) improvements of over 33% and 20% for 3 s and 10 s utterances, respectively.
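
For readers unfamiliar with the term, a relative EER improvement is usually computed as the reduction in EER divided by the baseline EER. The sketch below shows that arithmetic; the function name and the example EER values are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline_eer, new_eer):
    """Return the relative reduction in EER as a fraction of the baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical values: a drop from 6.0% to 4.0% EER is a one-third
# relative improvement, even though the absolute drop is 2 points.
example = relative_improvement(6.0, 4.0)
```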

Item Type: Conference or workshop item (Proceeding)
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing > Data Science
Depositing User: Ian McLoughlin
Date Deposited: 21 Apr 2017 09:28 UTC
Last Modified: 09 Jul 2019 11:19 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/61426 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008