Kent Academic Repository

Spectral Enhancement of Whispered Speech Based on Probability Mass Function

Sharifzadeh, Hamid Reza and McLoughlin, Ian Vince and Ahmadi, Farzaneh (2010) Spectral Enhancement of Whispered Speech Based on Probability Mass Function. In: 2010 Sixth Advanced International Conference on Telecommunications. IEEE, pp. 207-211. ISBN 978-1-4244-6748-8. E-ISBN 978-1-4244-6749-5. (doi:10.1109/AICT.2010.47) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:48916)

Official URL:
http://dx.doi.org/10.1109/AICT.2010.47

Abstract

Whispered speech can be used effectively for quiet and private communication over mobile phones, and it is also the means of communication for ENT patients under a regime of voice rest. The reconstruction of natural-sounding speech from such whispers can be useful for several types of application across scientific fields ranging from communications to biomedical engineering. Despite the useful applications for such a technology, the reconstruction of natural speech from whispers has received relatively little research effort to date. This paper presents novel methods for spectral enhancement and formant smoothing with the aim of attaining more natural-sounding speech within the reconstruction process. The proposed approach uses a probability mass-density function to identify a reliable formant trajectory through whispers and applies vocal modifications accordingly. Subjective evaluation experiments, performed to assess the performance of the techniques, are reported. A method for the near real-time conversion of whispers to normal phonated speech through a modified CELP codec was discussed in our previously published work, upon which the formant modification approach proposed in this paper builds.

Item Type: Book section
DOI/Identification number: 10.1109/AICT.2010.47
Uncontrolled keywords: speech enhancement; speech codecs; mobile communication; mobile handsets; natural languages; smoothing methods; speech processing; speech coding; working environment noise; frequency estimation
Subjects: T Technology
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Ian McLoughlin
Date Deposited: 07 Sep 2015 15:10 UTC
Last Modified: 16 Nov 2021 10:20 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/48916 (The current URI for this page, for reference purposes)

University of Kent Author Information

McLoughlin, Ian Vince.

Creator's ORCID: https://orcid.org/0000-0001-7111-2008