McLoughlin, Ian Vince, Song, Yan (2015) Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks. In: Proc. Interspeech 2015. (Unpublished) (KAR id:50258)
Author's Accepted Manuscript
Language: English
Official URL: http://interspeech2015.org
Abstract
Low frequency ultrasonic mouth state detection uses audio chirps reflected from the face in the region of the mouth to determine lip state: open, closed or partially open.
The chirps are located in a frequency range just above the threshold of human hearing and are thus both inaudible and unaffected by interfering speech, yet can be produced and sensed using inexpensive equipment.
To determine whether the mouth is open or closed, and hence form a measure of voice activity, this recently invented technique relies upon the difference in the reflected chirp caused by resonances introduced by the open or partially open mouth cavity.
Voice activity is then inferred from lip state through patterns of mouth movement, in a similar way to video-based lip-reading technologies.
This paper introduces a new metric based on spectrogram features extracted from the reflected chirp, with a convolutional neural network classification back-end, that yields excellent performance without needing the periodic resetting of the template closed-mouth reflection required by the original technique.
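As a rough illustration of the front end described in the abstract, the sketch below synthesises a linear chirp just above the audible range and computes a band-limited magnitude spectrogram of it. It is a minimal sketch under stated assumptions, not the authors' implementation: the sampling rate (96 kHz), the 20 to 24 kHz band, the 10 ms chirp duration, the frame sizes and the helper names `linear_chirp` and `spectrogram` are all hypothetical choices made for illustration. The reflection from the face and the convolutional neural network classifier are omitted.

```python
import cmath
import math


def linear_chirp(f0, f1, duration, fs):
    """Linear frequency sweep from f0 to f1 Hz over `duration` seconds."""
    n = int(round(duration * fs))
    k = (f1 - f0) / duration  # sweep rate in Hz per second
    return [math.sin(2 * math.pi * (f0 * t + 0.5 * k * t * t))
            for t in (i / fs for i in range(n))]


def spectrogram(signal, fs, frame_len, hop, f_lo, f_hi):
    """Magnitude spectrogram restricted to DFT bins inside [f_lo, f_hi]."""
    # Keep only the bins that fall inside the band of interest.
    bins = [b for b in range(frame_len // 2 + 1)
            if f_lo <= b * fs / frame_len <= f_hi]
    # Symmetric Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(frame_len)]
        # Direct DFT evaluated only at the selected bins.
        row = [abs(sum(seg[i] * cmath.exp(-2j * math.pi * b * i / frame_len)
                       for i in range(frame_len)))
               for b in bins]
        frames.append(row)
    return frames, [b * fs / frame_len for b in bins]


# Assumed, illustrative parameters (not taken from the paper).
FS = 96_000                    # sampling rate in Hz
F_LO, F_HI = 20_000, 24_000    # assumed chirp band in Hz
chirp = linear_chirp(F_LO, F_HI, duration=0.01, fs=FS)
frames, freqs = spectrogram(chirp, FS, frame_len=128, hop=64,
                            f_lo=F_LO, f_hi=F_HI)
print(len(frames), "frames x", len(freqs), "frequency bins")
```

In a pipeline of the kind the abstract outlines, a time-frequency matrix like `frames`, computed from the recorded reflection rather than from the emitted chirp, would be the input to the classifier.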
Item Type: | Conference or workshop item (Speech) |
---|---|
Uncontrolled keywords: | Voice activity detection, speech activity detection, ultrasonic speech, SaVAD |
Subjects: | T Technology |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Ian McLoughlin |
Date Deposited: | 21 Aug 2015 10:02 UTC |
Last Modified: | 05 Nov 2024 10:35 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/50258 (The current URI for this page, for reference purposes) |