Czajkowski, Marcin, Grzes, Marek, Kretowski, Marek (2014) Multi-test Decision Tree and its Application to Microarray Data Classification. Artificial Intelligence in Medicine, 61 (1). pp. 35-44. ISSN 0933-3657. (doi:10.1016/j.artmed.2014.01.005) (KAR id:48654)
Language: English
Official URL: http://dx.doi.org/10.1016/j.artmed.2014.01.005
Abstract
Objective:
A desirable property of tools used to investigate biological data is that they produce easy-to-understand models and predictive decisions. Decision trees are particularly promising in this regard because of their comprehensible nature, which resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have a tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity.
Methods:
We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions.
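As a rough illustration of what "several univariate tests in each non-terminal node" could look like, the following minimal sketch routes a sample through one such node. The abstract does not say how the individual test outcomes are combined, so the majority vote used here is an assumption, as are the class names `UnivariateTest` and `MultiTestNode`, the threshold form of each test, and the tie-breaking rule.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class UnivariateTest:
    """One threshold test on a single feature (e.g. one gene's expression level)."""
    feature: int      # index of the feature in the sample vector (hypothetical encoding)
    threshold: float  # the sample votes "left" when its value is <= threshold

    def goes_left(self, sample: Sequence[float]) -> bool:
        return sample[self.feature] <= self.threshold


@dataclass(frozen=True)
class MultiTestNode:
    """A non-terminal node that holds several univariate tests instead of one."""
    tests: List[UnivariateTest]

    def goes_left(self, sample: Sequence[float]) -> bool:
        # Assumption: outcomes are combined by simple majority vote.
        # Ties (possible with an even number of tests) are sent right.
        votes_left = sum(test.goes_left(sample) for test in self.tests)
        return 2 * votes_left > len(self.tests)


# A node with three tests on features 0, 3 and 7 (values chosen arbitrarily).
node = MultiTestNode([
    UnivariateTest(feature=0, threshold=1.5),
    UnivariateTest(feature=3, threshold=0.2),
    UnivariateTest(feature=7, threshold=4.0),
])

sample_a = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 3.0]  # two of three tests vote left
sample_b = [2.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 3.0]  # one of three tests votes left
```

Here `node.goes_left(sample_a)` evaluates to `True` and `node.goes_left(sample_b)` to `False`. Under this assumption, a single noisy feature value cannot redirect a sample on its own, which is one plausible reading of the "more stable and reliable predictions" the abstract refers to.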
Results:
Experimental validation was performed on several real-life gene expression datasets. Comparisons with eight classifiers show that MTDT has statistically significantly higher accuracy than popular decision tree classifiers and is highly competitive with ensemble learning algorithms. The proposed solution outperformed its baseline algorithm on 14 datasets by an average of 6 percent. A study performed on one of the datasets showed that the genes used in the MTDT classification model are supported by biological evidence in the literature.
Conclusion:
This paper introduces a new type of decision tree that is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high-dimensional microarray data than their popular counterparts.
| Item Type: | Article |
|---|---|
| DOI/Identification number: | 10.1016/j.artmed.2014.01.005 |
| Uncontrolled keywords: | Decision trees; univariate tests; underfitting; gene expression data |
| Subjects: | Q Science > Q Science (General) > Q335 Artificial intelligence |
| Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
| Depositing User: | Marek Grzes |
| Date Deposited: | 26 May 2015 19:57 UTC |
| Last Modified: | 05 Nov 2024 10:32 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/48654 (The current URI for this page, for reference purposes) |