Evaluating six candidate solutions for the small-disjunct problem and choosing the best solution via meta-learning

Carvalho, Deborah R., Freitas, Alex A. (2005) Evaluating six candidate solutions for the small-disjunct problem and choosing the best solution via meta-learning. Artificial Intelligence Review, 24 (1). pp. 61-98. ISSN 0269-2821. (doi:10.1007/s10462-005-1586-7) (The full text of this publication is not currently available from this repository. You may be able to access a copy if URLs are provided) (KAR id:14257)

Official URL: https://doi.org/10.1007/s10462-005-1586-7

Abstract

A set of classification rules can be considered as a disjunction of rules, where each rule is a disjunct. A small disjunct is a rule covering a small number of examples. Small disjuncts are a serious problem for effective classification, because the small number of examples satisfying these rules makes their prediction unreliable and error-prone. This paper offers two main contributions to the research on small disjuncts. First, it investigates six candidate solutions (algorithms) for the problem of small disjuncts. Second, it reports the results of a meta-learning experiment, which produced meta-rules predicting which algorithm will tend to perform best for a given data set. The algorithms investigated in this paper belong to different machine learning paradigms and their hybrid combinations, as follows: two versions of a decision-tree (DT) induction algorithm; two versions of a hybrid DT/genetic algorithm (GA) method; one GA; one hybrid DT/instance-based learning (IBL) algorithm. Experiments with 22 data sets evaluated both the predictive accuracy and the simplicity of the discovered rule sets, with the following conclusions. If one wants to maximize predictive accuracy only, then the hybrid DT/IBL seems to be the best choice. On the other hand, if one wants to maximize both predictive accuracy and rule set simplicity - which is important in the context of data mining - then a hybrid DT/GA seems to be the best choice.
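As an illustration of the definition in the abstract (a small disjunct is a rule covering a small number of examples), the following minimal sketch counts how many examples each rule covers and flags the rules at or below a coverage threshold. It is not the authors' method: the rule representation, the function names, the toy data and the threshold `max_covered` are all assumptions made for this example, and the abstract does not state what threshold the paper uses.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representations, chosen only for this illustration.
Example = Dict[str, float]
Rule = Tuple[str, Callable[[Example], bool]]  # (label, antecedent test)


def coverage(rules: List[Rule], examples: List[Example]) -> Dict[str, int]:
    """Count how many examples satisfy each rule's antecedent."""
    return {name: sum(1 for ex in examples if cond(ex)) for name, cond in rules}


def small_disjuncts(rules: List[Rule], examples: List[Example],
                    max_covered: int) -> List[str]:
    """Return labels of rules covering at most `max_covered` examples.

    `max_covered` is an assumed, user-chosen threshold.
    """
    counts = coverage(rules, examples)
    return [name for name, n in counts.items() if n <= max_covered]


# Toy data: ten examples with a single attribute.
examples = [{"age": a} for a in (18, 22, 25, 31, 40, 47, 52, 58, 63, 90)]

# Toy rule set: each rule is one disjunct of the overall classifier.
rules = [
    ("age < 60", lambda ex: ex["age"] < 60),
    ("60 <= age < 80", lambda ex: 60 <= ex["age"] < 80),
    ("age >= 80", lambda ex: ex["age"] >= 80),
]

print(coverage(rules, examples))
print(small_disjuncts(rules, examples, max_covered=3))
```

In this toy rule set the first rule covers eight examples, while the other two cover one example each and are therefore flagged as small disjuncts. Their predictions rest on very little evidence, which is the reliability problem the abstract describes.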

Item Type:	Article
DOI/Identification number:	10.1007/s10462-005-1586-7
Uncontrolled keywords:	data mining, classification, evolutionary algorithms, decision tree, meta-learning
Subjects:	Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming,
Divisions:	Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User:	Mark Wheadon
Date Deposited:	24 Nov 2008 18:02 UTC
Last Modified:	09 Mar 2023 11:30 UTC
Resource URI:	https://kar.kent.ac.uk/id/eprint/14257 (The current URI for this page, for reference purposes)

University of Kent Author Information

Freitas, Alex A.

Creator's ORCID:	https://orcid.org/0000-0001-9825-4700