Kent Academic Repository

Interpretable Ensembles of Classifiers for Uncertain Data with Bioinformatics Applications

Maia, Marcelo Rodrigues de Holanda, Plastino, Alexandre, Freitas, Alex A., de Magalhaes, Joao Pedro (2022) Interpretable Ensembles of Classifiers for Uncertain Data with Bioinformatics Applications. IEEE/ACM Transactions on Computational Biology and Bioinformatics. pp. 1-12. ISSN 1557-9964. (doi:10.1109/tcbb.2022.3218588) (KAR id:98040)

Abstract

Data uncertainty remains a challenging issue in many applications, but few classification algorithms can effectively cope with it. An ensemble approach for uncertain categorical features has recently been proposed, achieving promising results. It consists in biasing the sampling of features for each model in an ensemble so that less uncertain features are more likely to be sampled. Here we extend this idea of biased sampling and propose two new approaches: one for selecting training instances for each model in an ensemble and another for sampling the features to be considered when splitting a node during Random Forest training. We applied these approaches to classify ageing-related genes and predict drugs' side effects based on uncertain features representing protein-protein and protein-chemical interactions. We show that ensembles based on our proposed approaches achieve better predictive performance. In particular, our proposed approaches improved the performance of a Random Forest based on the most sophisticated approach for handling uncertain data in ensembles of this kind. Furthermore, we propose two new approaches for interpreting an ensemble of Naive Bayes classifiers and analyse their results on our datasets of ageing-related genes and drugs' side effects.
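
The biased feature sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not say how uncertainty is measured or how it is turned into sampling weights. The sketch assumes that each feature value is a probability in [0, 1] (for example, the confidence that an interaction exists), that a feature's uncertainty is its mean binary entropy over the instances, and that its sampling weight is one minus that uncertainty. The function names `binary_entropy`, `feature_uncertainty` and `biased_feature_sample` are hypothetical.

```python
import math
import random


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def feature_uncertainty(rows):
    """Mean binary entropy of each column (assumed uncertainty measure).

    `rows` is a list of equal-length lists of probabilities in [0, 1].
    """
    n_rows = len(rows)
    n_cols = len(rows[0])
    return [
        sum(binary_entropy(row[j]) for row in rows) / n_rows
        for j in range(n_cols)
    ]


def biased_feature_sample(rows, k, rng, eps=1e-9):
    """Draw k distinct column indices, favouring less uncertain columns.

    Weight = 1 - uncertainty + eps (an assumed weighting); eps keeps
    every weight strictly positive so any column can still be drawn.
    """
    weights = [1.0 - u + eps for u in feature_uncertainty(rows)]
    if k > len(weights):
        raise ValueError("k cannot exceed the number of features")
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        # Weighted draw among the columns not yet chosen.
        current = [weights[j] for j in remaining]
        pick = rng.choices(remaining, weights=current, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


if __name__ == "__main__":
    # Column 0 is maximally uncertain, column 1 is certain,
    # column 2 lies in between.
    data = [[0.5, 1.0, 0.9], [0.5, 0.0, 0.1]]
    print(feature_uncertainty(data))
    print(biased_feature_sample(data, 2, random.Random(0)))
```

Under the same assumptions, a weighted draw of this kind could also be applied to training instances or to the candidate features at each tree node, which are the two extensions the abstract proposes. The weighting the authors actually use may differ.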

Item Type: Article
DOI/Identification number: 10.1109/tcbb.2022.3218588
Additional information: ** Article version: VoR ** From Crossref journal articles via Jisc Publications Router ** Licence for VoR version of this article starting on 01-01-2022: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
Uncontrolled keywords: Applied Mathematics, Genetics, Biotechnology
Subjects: Q Science > QA Mathematics (inc Computing science)
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Funders: University of Kent (https://ror.org/00xkeyj56)
SWORD Depositor: JISC Publications Router
Depositing User: JISC Publications Router
Date Deposited: 30 Nov 2022 14:49 UTC
Last Modified: 05 Nov 2024 13:03 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/98040 (The current URI for this page, for reference purposes)
