Lavington, S.H. and Dewhurst, N. and Wilkins, E. and Freitas, A.A. (1999) Interfacing knowledge discovery algorithms to large database management systems. Information and Software Technology - special issue on data mining, 41 (9). pp. 605-617. ISSN 0950-5849.
The full text of this publication is not available from this repository.
The efficient mining of large, commercially credible databases requires a solution to at least two problems: (a) better integration between existing Knowledge Discovery algorithms and popular DBMS; (b) the ability to exploit opportunities for computational speedup such as data parallelism. Both problems need to be addressed in a generic manner, since the stated requirements of end-users cover a range of data mining paradigms, DBMS, and (parallel) platforms. In this paper we present a family of generic, set-based, primitive operations for Knowledge Discovery in Databases (KDD). We show how a number of well-known KDD classification metrics, drawn from paradigms such as Bayesian classifiers, Rule-Induction/Decision Tree algorithms, Instance-Based Learning methods, and Genetic Programming, can all be computed via our generic primitives. We then show how these primitives may be mapped into SQL and, where appropriate, optimised for good performance with respect to practical factors such as client-server communication overheads. We demonstrate how our primitives can support C4.5, a widely used rule induction system. Performance evaluation figures are presented for commercially available parallel platforms, such as the IBM SP/2.
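The abstract does not define the primitives themselves, so the following is only a minimal sketch of the general idea, under stated assumptions: a single counting operation, expressed as an SQL `GROUP BY`, returns the attribute-value/class counts from which a decision-tree metric such as information gain can be computed on the client. The function names (`count_by`, `entropy`, `information_gain`), the `weather` table and its columns are all hypothetical and are not taken from the paper; SQLite stands in for whatever DBMS is used.

```python
import math
import sqlite3


def count_by(conn, table, attr, class_col):
    """Hypothetical counting primitive (not the paper's definition).

    One GROUP BY query returns {(attribute_value, class_value): count}.
    Identifiers are interpolated into the SQL text, so they must come
    from trusted code, never from user input.
    """
    sql = (f"SELECT {attr}, {class_col}, COUNT(*) FROM {table} "
           f"GROUP BY {attr}, {class_col}")
    return {(a, c): n for a, c, n in conn.execute(sql)}


def entropy(counts):
    """Shannon entropy (in bits) of a collection of non-negative counts."""
    total = sum(counts)
    return -sum((n / total) * math.log2(n / total) for n in counts if n)


def information_gain(counts):
    """Information gain of the attribute, computed from the counts alone."""
    class_totals, per_value = {}, {}
    for (a, c), n in counts.items():
        class_totals[c] = class_totals.get(c, 0) + n
        per_value.setdefault(a, {})[c] = n
    total = sum(class_totals.values())
    # Weighted entropy of the class within each attribute value.
    remainder = sum(
        sum(by_class.values()) / total * entropy(by_class.values())
        for by_class in per_value.values()
    )
    return entropy(class_totals.values()) - remainder


# Toy data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (outlook TEXT, play TEXT)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?)",
    [("sunny", "no"), ("sunny", "no"), ("sunny", "yes"),
     ("rain", "yes"), ("rain", "yes"), ("overcast", "yes")],
)

counts = count_by(conn, "weather", "outlook", "play")
gain = information_gain(counts)
```

In this sketch only the aggregated counts cross the client-server boundary, not the underlying rows, which is one plausible way to read the abstract's concern with communication overheads.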
Uncontrolled keywords: data mining; parallelism; KDD primitives; decision trees; client-server
Subjects: Q Science > QA Mathematics (inc Computing science) > QA 76 Software, computer programming,
Divisions: Faculties > Science Technology and Medical Studies > School of Computing > Applied and Interdisciplinary Informatics Group
Depositing User: Mark Wheadon
Date Deposited: 02 Sep 2009 12:09
Last Modified: 01 Aug 2012 08:17
Resource URI: http://kar.kent.ac.uk/id/eprint/21713