Privacy Preserving Data Mining With Reduced Communication Overhead
In today's world, privacy and security are essential whenever data is shared. A fruitful direction for future data mining research is the development of techniques that incorporate privacy concerns. Most existing methods use random permutation techniques to mask the data and so preserve the privacy of sensitive values. However, current approaches to privacy-preserving data mining suffer from high communication and computation overheads. This work considers a distributed scenario in which data is partitioned vertically over multiple databases. The proposed algorithm uses the standard k-means clustering algorithm in a setting where the attributes of each entity are held in different parties' databases.
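
To make the vertically partitioned setting concrete, the sketch below runs ordinary k-means where each party holds only some columns of every entity. It relies on the fact that squared Euclidean distance is a sum over attributes, so each party can compute its share of every point-to-centroid distance locally. This is an illustrative sketch and not the proposed protocol: the function names are hypothetical, rows are assumed to be aligned by entity across parties, and the partial distances are summed in the clear, which is exactly the step a privacy-preserving scheme would have to protect.

```python
import random


def partial_sq_dists(local_rows, local_centroids):
    """One party: squared distance of each row to each centroid,
    computed over this party's own attributes only."""
    return [
        [sum((x - c) ** 2 for x, c in zip(row, cen)) for cen in local_centroids]
        for row in local_rows
    ]


def local_means(local_rows, labels, k, old_centroids):
    """One party: recompute its slice of every centroid from its own columns."""
    new_centroids = []
    for j in range(k):
        members = [row for row, lab in zip(local_rows, labels) if lab == j]
        if members:
            new_centroids.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centroids.append(old_centroids[j])  # keep an empty cluster in place
    return new_centroids


def vertical_kmeans(parties, k, iters=20, seed=0):
    """parties: one table per party, all listing the same entities in the same order.
    Returns the cluster labels and each party's slice of the centroids."""
    n = len(parties[0])
    init = random.Random(seed).sample(range(n), k)
    centroids = [[list(rows[i]) for i in init] for rows in parties]
    labels = None
    for _ in range(iters):
        partials = [partial_sq_dists(rows, cen) for rows, cen in zip(parties, centroids)]
        new_labels = []
        for i in range(n):
            # ASSUMPTION: summed in the clear. A privacy-preserving protocol
            # would replace this sum and comparison with a secure computation.
            totals = [sum(p[i][j] for p in partials) for j in range(k)]
            new_labels.append(min(range(k), key=totals.__getitem__))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        centroids = [
            local_means(rows, labels, k, cen) for rows, cen in zip(parties, centroids)
        ]
    return labels, centroids


# Made-up data: party A holds two attributes per entity, party B holds one.
party_a = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [8.0, 8.1], [7.9, 8.3], [8.2, 7.9]]
party_b = [[0.5], [0.4], [0.6], [5.0], [5.2], [4.9]]
labels, centroids = vertical_kmeans([party_a, party_b], k=2)
```

In this sketch each party sends one partial distance per entity and cluster in every iteration, which gives a rough sense of where the communication cost of such a scheme arises.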