Omodunbi, Bolaji A., Olawade, David B., Awe, Omosigho F., Soladoye, Afeez A., Aderinto, Nicholas, Ovsepian, Saak V., Boussios, Stergios (2025) Stacked Ensemble Learning for Classification of Parkinson's Disease Using Telemonitoring Vocal Features. Diagnostics, 15 (12). Article Number 1467. ISSN 2075-4418. (doi:10.3390/diagnostics15121467) (KAR id:110502)
|
PDF
Publisher pdf
Language: English
This work is licensed under a Creative Commons Attribution 4.0 International License.
|
|
|
Download this file (PDF/448kB) |
Preview |
| Request a format suitable for use with assistive technology e.g. a screenreader | |
| Official URL: https://doi.org/10.3390/diagnostics15121467 |
|
Abstract
Background: Parkinson's disease (PD) is a progressive neurodegenerative condition that impairs motor and non-motor functions. Early and accurate diagnosis is critical for effective management and care. Leveraging machine learning (ML) techniques, this study aimed to develop a robust prediction system for PD using a stacked ensemble learning approach, addressing challenges such as imbalanced datasets and feature optimization. Methods: An open-access PD dataset comprising 22 vocal attributes and 195 instances from 31 subjects was utilized. To prevent data leakage, subjects were divided into training (22 subjects) and testing (9 subjects) groups, ensuring no subject appeared in both sets. Preprocessing included data cleaning and normalization via min-max scaling. The synthetic minority oversampling technique (SMOTE) was applied exclusively to the training set to address class imbalance. Feature selection techniques-forward search, gain ratio, and Kruskal-Wallis test-were employed using subject-wise cross-validation to identify significant attributes. The developed system combined support vector machine (SVM), random forest (RF), K-nearest neighbor (KNN), and decision tree (DT) as base classifiers, with logistic regression (LR) as the meta-classifier in a stacked ensemble learning framework. Performance was evaluated using both recording-wise and subject-wise metrics to ensure clinical relevance. Results: The stacked ensemble learning model achieved realistic performance with a recording-wise accuracy of 84.7% and subject-wise accuracy of 77.8% on completely unseen subjects, outperforming individual classifiers including KNN (81.4%), RF (79.7%), and SVM (76.3%). Cross-validation within the training set showed 89.2% accuracy, with the performance difference highlighting the importance of proper validation methodology. Feature selection results showed that using the top 10 features ranked by gain ratio provided optimal balance between performance and clinical interpretability. The system's methodological robustness was validated through rigorous subject-wise evaluation, demonstrating the critical impact of validation methodology on reported performance. Conclusions: By implementing subject-wise validation and preventing data leakage, this study demonstrates that proper validation yields substantially different (and more realistic) results compared to flawed recording-wise approaches. The findings underscore the critical importance of validation methodology in healthcare ML applications and provide a template for methodologically sound PD classification research. Future research should focus on validating the model with larger, multi-center datasets and implementing standardized validation protocols to enhance clinical applicability.
| Item Type: | Article |
|---|---|
| DOI/Identification number: | 10.3390/diagnostics15121467 |
| Uncontrolled keywords: | Parkinson’s Disease, Feature Selection, Machine Learning, Predictive Analytics, Stacked Ensemble Learning |
| Subjects: | R Medicine |
| Institutional Unit: | Schools > Kent and Medway Medical School |
| Former Institutional Unit: |
There are no former institutional units.
|
| Funders: | University of Kent (https://ror.org/00xkeyj56) |
| SWORD Depositor: | JISC Publications Router |
| Depositing User: | JISC Publications Router |
| Date Deposited: | 12 Sep 2025 13:11 UTC |
| Last Modified: | 15 Sep 2025 11:40 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/110502 (The current URI for this page, for reference purposes) |
- Link to SensusAccess
- Export to:
- RefWorks
- EPrints3 XML
- BibTeX
- CSV
- Depositors only (login required):

https://orcid.org/0000-0002-2512-6131
Altmetric
Altmetric