Privacy Preserving Categorical Data Analysis With Unknown Distortion Parameters

Date Added: Nov 2009
Format: PDF

Randomized response techniques have been studied for privacy-preserving categorical data analysis. However, attackers can exploit the released distortion parameters to breach privacy. In this paper, the authors investigate whether data mining or statistical analysis tasks can still be conducted on randomized data when the distortion parameters are not disclosed to data miners. They first examine how various objective association measures between two variables may be affected by randomization. They then extend the analysis to multiple variables by examining the feasibility of hierarchical log-linear modeling. Finally, they show that some classic data mining tasks cannot be applied directly to the randomized data.
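
As a rough illustration of the setting (a sketch under stated assumptions, not the authors' method), the Python snippet below applies a simple randomized response scheme to two binary variables and compares one association measure, the odds ratio, before and after distortion. The names `randomize_joint` and `odds_ratio`, the keep probabilities `p_a` and `p_b`, and the example joint distribution are all hypothetical choices made for this illustration. A data miner who sees only the distorted table and does not know `p_a` and `p_b` cannot invert the distortion to recover the original table.

```python
def randomize_joint(joint, p_a, p_b):
    """Expected 2x2 joint distribution after randomized response.

    Assumption for this sketch: each binary variable is independently kept
    with probability p and flipped with probability 1 - p.
    """
    def prob(p, same):
        # Probability of reporting a value given whether it equals the truth.
        return p if same else 1.0 - p

    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):          # true value of variable A
        for j in range(2):      # true value of variable B
            for u in range(2):  # reported value of A
                for v in range(2):  # reported value of B
                    out[u][v] += prob(p_a, u == i) * prob(p_b, v == j) * joint[i][j]
    return out


def odds_ratio(table):
    """Odds ratio of a 2x2 table of cell probabilities (cells assumed nonzero)."""
    return (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])


# Hypothetical original joint distribution with a strong positive association.
original = [[0.4, 0.1], [0.1, 0.4]]

# Distortion parameters known to the data owner but, in the setting of the
# abstract, not disclosed to the data miner.
distorted = randomize_joint(original, p_a=0.8, p_b=0.8)

print("odds ratio, original: ", odds_ratio(original))
print("odds ratio, distorted:", odds_ratio(distorted))
```

In this particular example the distorted odds ratio lies between 1 and the original value, so the association is weakened but keeps its direction. Whether that holds for other measures and other distortion settings is the kind of question the abstract says the paper examines.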