Wan, Cen, Freitas, Alex A., de Magalhaes, João Pedro (2015) Predicting the pro-longevity or anti-longevity effect of model organism genes with new hierarchical feature selection methods. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 12 (2). pp. 262-275. ISSN 1545-5963. (doi:10.1109/TCBB.2014.2355218) (KAR id:47996)
PDF (Language: English; restricted to repository staff only)
Official URL: http://dx.doi.org/10.1109/TCBB.2014.2355218
Abstract
Ageing is a highly complex biological process that is still poorly understood. With the growing amount of ageing-related data available on the web, in particular concerning the genetics of ageing, it is timely to apply data mining methods to that data, in order to try to discover novel patterns that may assist ageing research. In this work, we introduce new hierarchical feature selection methods for the classification task of data mining and apply them to ageing-related data from four model organisms: Caenorhabditis elegans (worm), Saccharomyces cerevisiae (yeast), Drosophila melanogaster (fly), and Mus musculus (mouse). The main novel aspect of the proposed feature selection methods is that they exploit hierarchical relationships in the set of features (Gene Ontology terms) in order to improve the predictive accuracy of the Naïve Bayes and 1-Nearest Neighbour (1-NN) classifiers, which are used to classify model organisms’ genes into pro-longevity or anti-longevity genes. The results show that our hierarchical feature selection methods, when used together with Naïve Bayes and 1-NN classifiers, obtain higher predictive accuracy than the standard (without feature selection) Naïve Bayes and 1-NN classifiers, respectively. We also discuss the biological relevance of a number of Gene Ontology terms very frequently selected by our algorithms in our datasets.
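The abstract does not specify the algorithms, so the following is only a minimal sketch of one simple way a feature hierarchy could be exploited, and it should not be read as the authors' method. It assumes binary features (1 = gene annotated with the term) and a hierarchy in which a positive term implies that all its ancestors are positive, and a negative term implies that all its descendants are negative. Under that assumption, the implied feature values are redundant for a given instance and can be dropped before classification. The toy hierarchy `PARENTS` and the function names are hypothetical, not real Gene Ontology data.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy (child term -> parent terms); not real GO data.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every ancestor of `term` by walking parent links."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def non_redundant_features(
    instance: Dict[str, int], parents: Dict[str, Set[str]]
) -> Set[str]:
    """Keep only the terms whose value is not implied by another term.

    Assumption (illustrative): a 1 propagates to all ancestors,
    and a 0 propagates to all descendants.
    """
    anc = {term: ancestors(term, parents) for term in parents}
    redundant: Set[str] = set()
    for term, value in instance.items():
        if value == 1:
            # Ancestors of a positive term are implied to be positive.
            redundant |= anc[term]
        else:
            # Descendants of a negative term are implied to be negative.
            redundant |= {d for d in parents if term in anc[d]}
    return set(instance) - redundant


# One gene's feature vector, consistent with the toy hierarchy above.
gene = {"root": 1, "A": 1, "B": 0, "C": 1, "D": 0}
selected = non_redundant_features(gene, PARENTS)
```

In this toy case the most specific positive term (`C`) and the most general negative term (`B`) are retained; a classifier such as Naïve Bayes or 1-NN would then use only the retained features for that instance.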
Item Type: | Article |
---|---|
DOI/Identification number: | 10.1109/TCBB.2014.2355218 |
Uncontrolled keywords: | data mining, machine learning, hierarchical feature selection, ageing, bioinformatics |
Subjects: | Q Science > Q Science (General) > Q335 Artificial intelligence |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Alex Freitas |
Date Deposited: | 17 Apr 2015 15:17 UTC |
Last Modified: | 05 Nov 2024 10:31 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/47996 |