
Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks

McLoughlin, Ian Vince and Song, Yan (2015) Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks. In: Proc. Interspeech 2015. (Unpublished) (KAR id:50258)

Full text: PDF, Author's Accepted Manuscript (304 kB)
Language: English
Official URL: http://interspeech2015.org

Abstract

Low frequency ultrasonic mouth state detection uses audio chirps reflected from the face in the region of the mouth to determine lip state: open, closed, or partially open.

To determine whether the mouth is open or closed, and hence to form a measure of voice activity, this recently invented technique relies on differences in the reflected chirp caused by resonances introduced by an open or partially open mouth cavity.

This paper introduces a new metric based on spectrogram features extracted from the reflected chirp, with a convolutional neural network classification back-end, which yields excellent performance without the periodic resetting of the closed-mouth template reflection that the original technique requires.
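As a rough illustration of the front-end the abstract describes, the sketch below synthesizes a linear chirp and extracts a magnitude spectrogram from it, as one might do for a received mouth reflection. The probe band, sample rate, and frame parameters here are illustrative assumptions, not values taken from the paper, and the CNN classification back-end is omitted.

```python
import numpy as np

def lf_ultrasonic_chirp(fs=96000, f0=20000, f1=24000, dur=0.005):
    """Synthesize a linear chirp in a hypothetical low-frequency
    ultrasonic band (f0 -> f1 Hz over dur seconds)."""
    t = np.arange(int(fs * dur)) / fs
    # Linear chirp: instantaneous frequency sweeps from f0 to f1.
    phase = 2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / dur * t**2)
    return np.sin(phase)

def spectrogram(x, n_fft=256, hop=64):
    """Magnitude spectrogram via framed, Hann-windowed FFT.
    Returns an array of shape (frequency bins, time frames)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# A spectrogram like this would be the input image that a CNN
# back-end classifies as open- vs closed-mouth reflection.
features = spectrogram(lf_ultrasonic_chirp())
```

With the parameters above, a 5 ms chirp at 96 kHz gives 480 samples, so the feature map has 129 frequency bins and 4 time frames; in practice the reflection, not the transmitted chirp, would be analyzed.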

Item Type: Conference or workshop item (Speech)
Uncontrolled keywords: Voice activity detection, speech activity detection, ultrasonic speech, SaVAD
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 21 Aug 2015 10:02 UTC
Last Modified: 29 May 2019 15:56 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/50258 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008
