A Survey of Parallel Data Mining.
In: Proc 2nd Int Conf on the Practical Applications of Knowledge Discovery and Data Mining.
With the fast, continuous increase in the number and size of databases, parallel data mining
is a natural and cost-effective approach to tackling the problem of scalability in data mining.
Recently there has been considerable research on parallel data mining. However, most
projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This
paper surveys parallel data mining from a broader perspective. More precisely, we discuss the
parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule
induction, instance-based learning, genetic algorithms and neural networks. Using the lessons
learned from this discussion, we also derive a set of heuristic principles for designing efficient
parallel data mining algorithms.