A Grid-Based Distributed SVM Data Mining Algorithm

Date Added: Feb 2009
Format: PDF

Distributing data and its processing makes it possible to solve larger problems and to run applications that are distributed by nature. In this paper, the authors present a grid-based distributed Support Vector Machine (SVM) algorithm. The Grid is a distributed computing infrastructure that enables coordinated resource sharing within dynamic organizations consisting of individuals, institutions, and resources. Grid environments can be used for both compute-intensive tasks and data-intensive applications because they offer resources, services, and data access mechanisms. Data mining algorithms and knowledge discovery processes are both compute- and data-intensive; the Grid can therefore offer a computing and data management infrastructure for supporting decentralized and parallel data analysis. The SVM algorithm is implemented in C using MPI (Message Passing Interface).