Kent Academic Repository

Hierarchical classification of protein function with ensembles of rules and particle swarm optimisation

Holden, Nicholas, Freitas, Alex A. (2009) Hierarchical classification of protein function with ensembles of rules and particle swarm optimisation. Soft Computing, 13 (3). pp. 259-272. ISSN 1432-7643. (doi:10.1007/s00500-008-0321-0) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:24048)

Official URL:
http://dx.doi.org/10.1007/s00500-008-0321-0

Abstract

This paper focuses on hierarchical classification problems where the classes to be predicted are organized in the form of a tree. The standard top-down divide and conquer approach for hierarchical classification consists of building a hierarchy of classifiers where a classifier is built for each internal (non-leaf) node in the class tree. Each classifier discriminates only between its child classes. After the tree of classifiers is built, the system uses them to classify test examples one class level at a time, so that when the example is assigned a class at a given level, only the child classes need to be considered at the next level. This approach has the drawback that, if a test example is misclassified at a certain class level, it will be misclassified at deeper levels too. In this paper we propose hierarchical classification methods to mitigate this drawback. More precisely, we propose a method called hierarchical ensemble of hierarchical rule sets (HEHRS), where different ensembles are built at different levels in the class tree and each ensemble consists of different rule sets built from training examples at different levels of the class tree. We also use a particle swarm optimisation (PSO) algorithm to optimise the rule weights used by HEHRS to combine the predictions of different rules into a class to be assigned to a given test example. In addition, we propose a variant of a method to mitigate the aforementioned drawback of top-down classification. These three types of methods are compared against the standard top-down hierarchical classification method on six challenging bioinformatics datasets, involving the prediction of protein function. Overall, HEHRS with the rule weights optimised by the PSO algorithm obtains the best predictive accuracy out of the four types of hierarchical classification methods.
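
The standard top-down approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not cover HEHRS or the PSO weight optimisation; the class tree, the feature names `a` and `b`, and the threshold rules standing in for trained classifiers are all hypothetical. It shows only how one local classifier per internal node is applied level by level, and why a mistake at one level cannot be corrected at deeper levels.

```python
from typing import Callable, Dict, List

Example = Dict[str, float]
# A local classifier chooses among the child classes of one internal node.
LocalClassifier = Callable[[Example], str]

# Hypothetical class tree: each internal (non-leaf) node maps to its children.
CLASS_TREE: Dict[str, List[str]] = {
    "root": ["1", "2"],
    "1": ["1.1", "1.2"],
    "2": ["2.1", "2.2"],
}

# Hypothetical threshold rules standing in for trained classifiers.
CLASSIFIERS: Dict[str, LocalClassifier] = {
    "root": lambda x: "1" if x["a"] > 0.5 else "2",
    "1": lambda x: "1.1" if x["b"] > 0.5 else "1.2",
    "2": lambda x: "2.1" if x["b"] > 0.5 else "2.2",
}


def top_down_classify(
    example: Example,
    classifiers: Dict[str, LocalClassifier],
    tree: Dict[str, List[str]],
    root: str = "root",
) -> List[str]:
    """Return the predicted classes, one per level, from the root downwards."""
    path: List[str] = []
    node = root
    while node in tree:  # keep descending until a leaf class is reached
        predicted = classifiers[node](example)
        if predicted not in tree[node]:
            raise ValueError(f"{predicted!r} is not a child class of {node!r}")
        path.append(predicted)
        node = predicted  # only this node's children are considered next
    return path


# A correctly routed example ends in the leaf class "1.2".
good = top_down_classify({"a": 0.9, "b": 0.2}, CLASSIFIERS, CLASS_TREE)

# If the true class were "1.2" but the root classifier picks "2",
# every deeper prediction lies under "2" and is therefore also wrong.
bad = top_down_classify({"a": 0.4, "b": 0.2}, CLASSIFIERS, CLASS_TREE)
```

In this toy setting `bad` can never contain a class under `1`, because the classifier for node `1` is never consulted once the root has chosen `2`. This is the error-propagation drawback that the methods proposed in the paper aim to mitigate.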

Item Type: Article
DOI/Identification number: 10.1007/s00500-008-0321-0
Uncontrolled keywords: classification, data mining, particle swarm optimisation
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Mark Wheadon
Date Deposited: 29 Mar 2010 12:12 UTC
Last Modified: 05 Nov 2024 10:03 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/24048 (The current URI for this page, for reference purposes)
