Communication-Efficient Privacy-Preserving Clustering

Date Added: Mar 2010
Format: PDF

The ability to store vast quantities of data and the emergence of high-speed networking have led to intense interest in distributed data mining. However, privacy concerns and regulations often prevent multiple parties from sharing data. Privacy-preserving distributed data mining lets participating organizations cooperatively run data mining algorithms without revealing their individual data items to each other. The paper makes several contributions, among them a simple, deterministic, I/O-efficient k-clustering algorithm designed to enable an efficient privacy-preserving version of the algorithm.
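
To illustrate what computing cooperatively without revealing individual data items can look like, here is a minimal sketch of additive secret sharing, a standard building block in this area. It is not the protocol from the paper, whose details this abstract does not give. The modulus and the function names `share`, `add_shared`, and `reconstruct` are assumptions made for the example, and both parties are simulated in a single process.

```python
import secrets

# Hypothetical public modulus; assumed larger than any value being shared.
MODULUS = 2**61 - 1


def share(value, modulus=MODULUS):
    """Split an integer into two random-looking shares that sum to it mod modulus."""
    first = secrets.randbelow(modulus)
    second = (value - first) % modulus
    return first, second


def add_shared(x_shares, y_shares, modulus=MODULUS):
    """Add two shared values. Each party only adds the shares it already holds."""
    return (
        (x_shares[0] + y_shares[0]) % modulus,  # computed locally by party 0
        (x_shares[1] + y_shares[1]) % modulus,  # computed locally by party 1
    )


def reconstruct(shares, modulus=MODULUS):
    """Combine both shares to reveal the value (done only for the final result)."""
    return (shares[0] + shares[1]) % modulus


# Each party sums its own data locally, then shares only that local total.
party_a_total = sum([3, 8, 4])   # party A's private items (illustrative values)
party_b_total = sum([10, 5])     # party B's private items (illustrative values)

joint = add_shared(share(party_a_total), share(party_b_total))
joint_total = reconstruct(joint)
```

Individual items never leave their owner, and a single share on its own is a uniformly random number. With only two parties, the reconstructed total still lets each side deduce the other's local total, which is why practical protocols tend to keep intermediate values in shared form and reveal only the final output.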