Xu, Yan, McLoughlin, Ian Vince, Song, Yan, Wu, Kui (2016) Improved i-Vector Representation for Speaker Diarization. Circuits, Systems, and Signal Processing, 35. pp. 3393-3404. ISSN 0278-081X. E-ISSN 1531-5878. (doi:10.1007/s00034-015-0206-2) (KAR id:55023)
PDF (The final publication is available at Springer via http://link.springer.com/article/10.1007/s00034-015-0206-2/fulltext.html)
Author's Accepted Manuscript
Language: English
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Official URL: http://dx.doi.org/10.1007/s00034-015-0206-2
Abstract
This paper proposes using a pre-trained deep neural network (DNN) to enhance the i-vector representation used for speaker diarization. In effect, we replace the Gaussian Mixture Model (GMM) typically used to train a Universal Background Model (UBM) with a DNN that has been trained on a different large-scale dataset. To train the T-matrix, we use a supervised UBM instead of a traditional unsupervised UBM derived from a single feature type: the DNN calculates the posterior information from filterbank input features, and MFCC features are then used to train the UBM. Next, we jointly use the DNN posteriors and MFCC features to calculate the zeroth- and first-order Baum-Welch statistics for training an extractor, from which we obtain the i-vector. The system is shown to achieve a significant improvement on the NIST 2008 speaker recognition evaluation (SRE) telephone data task compared to state-of-the-art approaches.
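The statistics mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions, N_c = sum over frames t of gamma_t(c) and F_c = sum over frames t of gamma_t(c) x_t, where gamma_t(c) is the posterior of class c for frame t and x_t is the MFCC vector of that frame. The function name `baum_welch_stats`, the toy dimensions, and the random stand-ins for DNN posteriors and MFCCs are hypothetical and not taken from the paper; this is not the authors' implementation.

```python
import numpy as np


def baum_welch_stats(posteriors, features):
    """Compute zeroth- and first-order Baum-Welch statistics.

    posteriors: (T, C) frame-level class posteriors. In the paper's setting
                these would come from a DNN fed with filterbank features;
                here they are just an array whose rows sum to 1.
    features:   (T, D) acoustic features for the same T frames (e.g. MFCCs).
    Returns (n, f), where n has shape (C,) and f has shape (C, D).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    features = np.asarray(features, dtype=float)
    if posteriors.shape[0] != features.shape[0]:
        raise ValueError("posteriors and features must cover the same frames")
    n = posteriors.sum(axis=0)       # zeroth order: soft frame count per class
    f = posteriors.T @ features      # first order: posterior-weighted feature sums
    return n, f


# Toy data standing in for real DNN outputs and MFCCs (assumed sizes).
rng = np.random.default_rng(0)
T, C, D = 200, 4, 13
logits = rng.normal(size=(T, C))
post = np.exp(logits)
post /= post.sum(axis=1, keepdims=True)   # softmax so each row sums to 1
mfcc = rng.normal(size=(T, D))

n, f = baum_welch_stats(post, mfcc)

# Per-class means of a supervised UBM follow from the same statistics.
ubm_means = f / n[:, None]
```

Because the frame alignment comes from the DNN while the feature sums come from the MFCCs, the two inputs only need to share the same frame index; they do not need the same dimensionality.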
| Item Type: | Article |
|---|---|
| DOI/Identification number: | 10.1007/s00034-015-0206-2 |
| Uncontrolled keywords: | Speaker diarization; DNN; i-vector |
| Subjects: | T Technology |
| Institutional Unit: | Schools > School of Computing |
| Former Institutional Unit: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
| Depositing User: | Ian McLoughlin |
| Date Deposited: | 19 Apr 2016 10:13 UTC |
| Last Modified: | 20 May 2025 10:18 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/55023 (The current URI for this page, for reference purposes) |

https://orcid.org/0000-0001-7111-2008