
End-to-end DNN-CNN Classification for Language Identification

Jin, Ma, Song, Yan, McLoughlin, Ian Vince (2017) End-to-end DNN-CNN Classification for Language Identification. In: Proceedings of The World Congress on Engineering 2017. 1. pp. 119-203. IAENG ISBN 978-988-14-0474-9. (KAR id:61426)

PDF Author's Accepted Manuscript
Language: English
Official URL
http://www.iaeng.org/publication/WCE2017/

Abstract

A defining problem in spoken language identification (LID) is how to design effective representations which allow features to be extracted that are specific to language information.

In this paper, a novel network is proposed and explored that models an effective representation using first and second-order statistics of features extracted from a well-trained phoneme-related DNN bottleneck network followed by a stack of CNN convolutional layers.
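The idea of summarising frame-level features by their first and second-order statistics can be illustrated with a minimal sketch. It assumes the statistics are the per-dimension mean and standard deviation taken over time; the abstract does not state the exact statistics or where in the network they are computed, and the function name `stats_pool` and the toy feature values are hypothetical.

```python
import math


def stats_pool(frames):
    """Pool a variable-length sequence of feature vectors into one fixed-length vector.

    frames: list of T frames, each a list of D floats
            (standing in for per-frame outputs of the convolutional layers).
    Returns 2*D floats: per-dimension means followed by standard deviations.
    """
    if not frames:
        raise ValueError("need at least one frame")
    t = len(frames)
    dim = len(frames[0])
    # First-order statistic: mean of each feature dimension over time.
    means = [sum(f[d] for f in frames) / t for d in range(dim)]
    # Second-order statistic: population standard deviation (divide by T).
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / t)
        for d in range(dim)
    ]
    return means + stds


# Toy utterances of different lengths (hypothetical values, D = 2).
short_utt = [[1.0, 2.0], [3.0, 6.0]]
long_utt = [[1.0, 2.0], [3.0, 6.0], [1.0, 2.0], [3.0, 6.0]]
pooled = stats_pool(short_utt)
```

Because the pooled vector always has length 2*D, utterances of any duration (such as 3 s or 10 s) map to the same input size for a final classifier, which is one common motivation for this kind of representation.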

Evaluation with NIST LRE 2009 shows improved performance compared to current state-of-the-art systems, achieving relative equal error rate (EER) improvements of over 33% and 20% for 3 s and 10 s utterances, respectively.
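A relative EER improvement is the reduction in EER expressed as a fraction of the baseline EER. The short sketch below shows the arithmetic; the function name and the EER values are hypothetical placeholders chosen for illustration and are not results from the paper.

```python
def relative_improvement(baseline_eer, new_eer):
    """Return the relative EER reduction as a fraction of the baseline EER."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical example: a baseline EER of 5.0% reduced to 4.0%.
example = relative_improvement(5.0, 4.0)
print(f"relative improvement: {example:.0%}")
```

Note that a relative improvement is different from an absolute one: in this made-up case the absolute reduction is only one percentage point.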

Item Type: Conference or workshop item (Proceeding)
Subjects: T Technology
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 21 Apr 2017 09:28 UTC
Last Modified: 16 Feb 2021 13:44 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/61426 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008