
Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks

McLoughlin, Ian Vince and Song, Yan (2015) Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks. In: Proc. Interspeech 2015. (Unpublished) (KAR id:50258)

Full text: PDF, Author's Accepted Manuscript (304 kB)
Language: English
Official URL: http://interspeech2015.org

Abstract

Low frequency ultrasonic mouth state detection uses audio chirps reflected from the face in the region of the mouth to determine lip state: open, closed, or partially open.

To determine whether the mouth is open or closed, and hence to form a measure of voice activity, this recently invented technique relies on differences in the reflected chirp caused by resonances introduced by an open or partially open mouth cavity.

This paper introduces a new metric based on spectrogram features extracted from the reflected chirp, with a convolutional neural network classification back-end, which yields excellent performance without the periodic resetting of the closed-mouth template reflection that the original technique requires.
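As a rough illustration of the front-end the abstract describes, the sketch below synthesizes a linear chirp and extracts a magnitude spectrogram from it, as one might do for a received mouth reflection. The probe band, sample rate, and frame parameters here are illustrative assumptions, not values taken from the paper, and the CNN classification back-end is omitted.

```python
import numpy as np

def lf_ultrasonic_chirp(fs=96000, f0=20000, f1=24000, dur=0.005):
    """Synthesize a linear chirp in a hypothetical low-frequency
    ultrasonic band (f0 -> f1 Hz over dur seconds)."""
    t = np.arange(int(fs * dur)) / fs
    # Linear chirp: instantaneous frequency sweeps from f0 to f1.
    phase = 2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / dur * t**2)
    return np.sin(phase)

def spectrogram(x, n_fft=256, hop=64):
    """Magnitude spectrogram via framed, Hann-windowed FFT.
    Returns an array of shape (frequency bins, time frames)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# A spectrogram like this would be the input image that a CNN
# back-end classifies as open- vs closed-mouth reflection.
features = spectrogram(lf_ultrasonic_chirp())
```

With the parameters above, a 5 ms chirp at 96 kHz gives 480 samples, so the feature map has 129 frequency bins and 4 time frames; in practice the reflection, not the transmitted chirp, would be analyzed.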

Item Type: Conference or workshop item (Speech)
Uncontrolled keywords: Voice activity detection, speech activity detection, ultrasonic speech, SaVAD
Subjects: T Technology
Divisions: Faculties > Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 21 Aug 2015 10:02 UTC
Last Modified: 29 May 2019 15:56 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/50258 (The current URI for this page, for reference purposes)
McLoughlin, Ian Vince: https://orcid.org/0000-0001-7111-2008
