Combining Distributed Memory and Shared Memory Parallelization for Data Mining Algorithms

Date Added: Jan 2011
Format: PDF

In this paper, the authors focus on using a cluster of SMPs (symmetric multiprocessors) for scalable data mining. They develop distributed-memory and shared-memory parallelization techniques that apply to a number of common data mining algorithms, and incorporate these techniques in a middleware called FREERIDE (FRamework for Rapid Implementations of Datamining Engines). They present an experimental evaluation of the techniques and the framework using apriori association mining, k-means clustering, and a decision tree algorithm.
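
As an illustration of how the two levels of parallelism can be combined, the following is a minimal sketch of one k-means iteration. It is not FREERIDE's API, which this abstract does not describe. All function names here (`local_reduce`, `merge`, `node_reduce`, `kmeans_step`) are hypothetical, the cluster nodes are simulated by a sequential loop, and Python threads stand in for the processors of one SMP node (they show the structure only, since CPython threads do not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centers[c])))

def local_reduce(points, centers):
    """Per-thread partial result: coordinate sums and point counts per cluster."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for pt in points:
        c = nearest(pt, centers)
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += pt[d]
    return sums, counts

def merge(a, b):
    """Combine two partial results element-wise."""
    sums = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    counts = [x + y for x, y in zip(a[1], b[1])]
    return sums, counts

def split(items, parts):
    """Split items into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def node_reduce(node_points, centers, threads):
    """Shared-memory level: threads of one node reduce their chunks, then merge."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        partials = list(pool.map(lambda chunk: local_reduce(chunk, centers),
                                 split(node_points, threads)))
    result = partials[0]
    for part in partials[1:]:
        result = merge(result, part)
    return result

def kmeans_step(points, centers, nodes=2, threads=2):
    """One k-means iteration. Assumption: nodes are simulated sequentially here;
    on a real cluster each node would run node_reduce on its own data and the
    per-node results would be combined by message passing."""
    node_results = [node_reduce(chunk, centers, threads)
                    for chunk in split(points, nodes)]
    total = node_results[0]
    for part in node_results[1:]:
        total = merge(total, part)  # distributed-memory level: global merge
    sums, counts = total
    # An empty cluster keeps its previous center.
    return [[s / n for s in row] if n else list(old)
            for row, n, old in zip(sums, counts, centers)]
```

For example, `kmeans_step([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])` returns `[[0.0, 1.0], [10.0, 11.0]]`. Because each worker accumulates into its own partial result and the results are merged afterwards, the same `merge` step serves both the threads within a node and the nodes of the cluster.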