Hierarchical classification of protein function with ensembles of rules and particle swarm optimisation

Holden, Nicholas and Freitas, Alex A. (2009) Hierarchical classification of protein function with ensembles of rules and particle swarm optimisation. Soft Computing, 13 (3). pp. 259-272. ISSN 1432-7643. (The full text of this publication is not available from this repository)

The full text of this publication is not available from this repository. (Contact us about this Publication)
Official URL
http://dx.doi.org/10.1007/s00500-008-0321-0

Abstract

This paper focuses on hierarchical classification problems where the classes to be predicted are organized in the form of a tree. The standard top-down divide and conquer approach for hierarchical classification consists of building a hierarchy of classifiers where a classifier is built for each internal (non-leaf) node in the class tree. Each classifier discriminates only between its child classes. After the tree of classifiers is built, the system uses them to classify test examples one class level at a time, so that when the example is assigned a class at a given level, only the child classes need to be considered at the next level. This approach has the drawback that, if a test example is misclassified at a certain class level, it will be misclassified at deeper levels too. In this paper we propose hierarchical classification methods to mitigate this drawback. More precisely, we propose a method called hierarchical ensemble of hierarchical rule sets (HEHRS), where different ensembles are built at different levels in the class tree and each ensemble consists of different rule sets built from training examples at different levels of the class tree. We also use a particle swarm optimisation (PSO) algorithm to optimise the rule weights used by HEHRS to combine the predictions of different rules into a class to be assigned to a given test example. In addition, we propose a variant of a method to mitigate the aforementioned drawback of top-down classification. These three types of methods are compared against the standard top-down hierarchical classification method in six challenging bioinformatics datasets, involving the prediction of protein function. Overall HEHRS with the rule weights optimised by the PSO algorithm obtains the best predictive accuracy out of the four types of hierarchical classification method

Item Type: Article
Uncontrolled keywords: classification, data mining, particle swarm optimisation
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming,
Divisions: Faculties > Science Technology and Medical Studies > School of Computing > Applied and Interdisciplinary Informatics Group
Depositing User: Mark Wheadon
Date Deposited: 29 Mar 2010 12:12
Last Modified: 19 May 2014 15:55
Resource URI: http://kar.kent.ac.uk/id/eprint/24048 (The current URI for this page, for reference purposes)
  • Depositors only (login required):