A new approach for interpreting Random Forest models and its application to the biology of ageing

Fabris, Fabio, Freitas, Alex A. (2018) A new approach for interpreting Random Forest models and its application to the biology of ageing. Bioinformatics. ISSN 1367-4803. (doi:10.1093/bioinformatics/bty087)

Abstract

This work uses the Random Forest (RF) classification algorithm to predict if a gene is over-expressed, under-expressed or has no change in expression with age in the brain. RFs have high predictive power, and RF models can be interpreted using a feature (variable) importance measure. However, current feature importance measures evaluate a feature as a whole (all feature values). We show that, for a popular type of biological data (Gene Ontology-based), usually only one value of a feature is particularly important for classification and the interpretation of the RF model. Hence, we propose a new algorithm for identifying the most important and most informative feature values in an RF model.
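
To illustrate the distinction the abstract draws between scoring a feature as a whole and scoring its individual values, here is a minimal Python sketch. It is not the algorithm proposed in the paper: the score used (the largest gain in class probability over the class prior, per feature value), the function names `class_distribution` and `value_lift`, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter

def class_distribution(labels):
    """Return {class: relative frequency} for a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in counts.items()}

def value_lift(feature, labels):
    """Crude per-value score (an assumption, not the paper's measure):
    for each feature value, the largest gain in class probability
    over the overall class prior."""
    prior = class_distribution(labels)
    lifts = {}
    for value in sorted(set(feature)):
        # Class labels of the instances that take this feature value.
        subset = [y for x, y in zip(feature, labels) if x == value]
        cond = class_distribution(subset)
        lifts[value] = max(cond.get(c, 0.0) - prior[c] for c in prior)
    return lifts

# Hypothetical toy data: 1 = gene annotated with some GO term, 0 = not annotated.
go_term = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
labels = ["over", "over", "over", "over",
          "over", "under", "none", "under", "none", "under", "none", "none"]

lifts = value_lift(go_term, labels)
print(lifts)
```

In this toy data, the value 1 (gene annotated with the term) receives a much higher score than the value 0, which is the kind of asymmetry between feature values that a whole-feature importance measure would not expose.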

Item Type: Article
DOI/Identification number: 10.1093/bioinformatics/bty087
Uncontrolled keywords: machine learning, classification, bioinformatics, data mining
Subjects: Q Science > Q Science (General) > Q335 Artificial intelligence
Divisions: Faculties > Sciences > School of Computing > Computational Intelligence Group
Depositing User: Alex Freitas
Date Deposited: 09 Apr 2018 10:44 UTC
Last Modified: 29 May 2019 20:26 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/66666 (The current URI for this page, for reference purposes)
Fabris, Fabio: https://orcid.org/0000-0001-7159-4668