Establishing a Benchmark for Re-Identification Methods and Its Validation Using Fuzzy Clustering
Privacy-preserving data mining and statistical disclosure control are related fields of increasing importance. Their aim is to allow the publication of sensitive data without compromising the privacy of data respondents. To that end, masking methods have been designed that distort the data in a way that preserves both confidentiality and data utility. Alternatively, methods have been constructed to generate synthetic data with properties similar to those of the original data. At the same time, research on re-identification methods (record and variable matching) has been pushed forward by the current interest in security issues and the huge amount of data stored in databases.