Privacy Preserving Distributed Data Mining using Randomized Site Selection
Distributed data mining extracts hidden, useful information from data sources distributed among several sites. When mining spans several sites, the privacy of the participating sites becomes a major concern, and sensitive information pertaining to individual sites needs strong protection. Several approaches for mining data securely in a distributed environment have been proposed. In these approaches, however, collusion among participating sites may reveal sensitive information about other sites, and they fall short of their intended goals of maintaining the privacy of individual sites, reducing computational complexity, and minimizing communication overhead. The proposed method finds global frequent itemsets in a distributed environment with minimal communication among sites and ensures a higher degree of privacy through Elliptic Curve Cryptography (ECC) and randomized site selection.