A Survey of Parallel Data Mining.
In: Proc 2nd Int Conf on the Practical Applications of Knowledge Discovery and Data Mining.
(Full text available)
With the fast, continuous increase in the number and size of databases, parallel data mining
is a natural and cost-effective approach to tackle the problem of scalability in data mining.
Recently there has been considerable research on parallel data mining. However, most
projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This
paper surveys parallel data mining with a broader perspective. More precisely, we discuss the
parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule
induction, instance-based learning, genetic algorithms and neural networks. Using the lessons
learned from this discussion, we also derive a set of heuristic principles for designing efficient
parallel data mining algorithms.