Date Added: Jan 2012
Classification is a data mining technique widely used in critical domains such as financial risk analysis, biology, and communication network management. Classification accuracy and learning from distributed datasets are the most challenging topics in the field of supervised learning. In this paper, the authors first briefly review the background of parallel and distributed classification algorithms and then propose a novel approach for classification on large distributed datasets. This approach is based on code migration instead of data migration. Extensive experimental results using a popular benchmark test suite show the effectiveness of this approach in terms of accuracy. The results also show that the proposed method slightly improves classification accuracy over standard methods.