Fabris, Fabio, Doherty, Aoife, Palmer, Daniel, de Magalhães, João Pedro, Freitas, Alex A. (2018) A new approach for interpreting Random Forest models and its application to the biology of ageing. Bioinformatics, 34 (14). pp. 2449-2456. ISSN 1367-4803. (doi:10.1093/bioinformatics/bty087) (KAR id:66666)
PDF
Publisher pdf
Language: English
This work is licensed under a Creative Commons Attribution 4.0 International License.
Official URL: https://doi.org/10.1093/bioinformatics/bty087
Abstract
This work uses the Random Forest (RF) classification algorithm to predict if a gene is over-expressed, under-expressed or has no change in expression with age in the brain. RFs have high predictive power, and RF models can be interpreted using a feature (variable) importance measure. However, current feature importance measures evaluate a feature as a whole (all feature values). We show that, for a popular type of biological data (Gene Ontology-based), usually only one value of a feature is particularly important for classification and the interpretation of the RF model. Hence, we propose a new algorithm for identifying the most important and most informative feature values in an RF model.
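The distinction the abstract draws between the importance of a feature as a whole and the importance of a single feature value can be illustrated with a small sketch. This is not the algorithm proposed in the paper, which the abstract does not specify; it only assumes a binary feature (for example, whether a gene is annotated with some Gene Ontology term) and uses the standard Gini impurity. The function names, the toy data and the `has_term` feature are hypothetical and invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_summary(feature, labels):
    """Summarise a split on one binary feature.

    Returns the impurity decrease for the feature as a whole, plus the
    impurity of each branch (feature value 1 and feature value 0).
    Illustrative only; not the measure defined in the paper.
    """
    pos = [y for x, y in zip(feature, labels) if x == 1]
    neg = [y for x, y in zip(feature, labels) if x == 0]
    n = len(labels)
    weighted = (len(pos) / n) * gini(pos) + (len(neg) / n) * gini(neg)
    return {
        "decrease": gini(labels) - weighted,
        "gini_value_1": gini(pos),
        "gini_value_0": gini(neg),
    }


# Invented toy data: 1 means the gene is annotated with a hypothetical GO term.
has_term = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
expression = ["over", "over", "over", "under", "none",
              "over", "under", "none", "none", "under"]

summary = split_summary(has_term, expression)
print(summary)
```

In this toy data the branch where the term is present is pure (every such gene is over-expressed), while the branch where it is absent stays mixed. A whole-feature score such as `decrease` folds both branches into one number, which is the limitation the abstract points to for Gene Ontology-based data.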
Item Type: | Article |
---|---|
DOI/Identification number: | 10.1093/bioinformatics/bty087 |
Uncontrolled keywords: | machine learning, classification, bioinformatics, data mining |
Subjects: | Q Science > Q Science (General) > Q335 Artificial intelligence |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Alex Freitas |
Date Deposited: | 09 Apr 2018 10:44 UTC |
Last Modified: | 05 Nov 2024 11:05 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/66666 |