Re-Identification and Synthetic Data Generators: A Case Study

Provided by: Universitat Rostock
Topic: Data Management
Format: PDF
Synthetic data generators are increasingly used to replace sensitive data with artificial data that preserve, to a predetermined extent, the utility of the original data. When such generators are used, re-identification analysis is usually disregarded on the grounds that, since the released data are artificial, no real re-identification is possible. While this may be reasonable if synthetic generation is performed on the confidential outcome attributes, it is an unrealistic assumption if synthetic generation is performed on the quasi-identifier attributes. In the latter case, re-identification can indeed happen if a snooper is able to link an external identified data source with some record in the released dataset using the quasi-identifier attributes: producing a correct pair (identifier, confidential attributes) constitutes a re-identification.
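
The linkage attack described in the abstract can be illustrated with a small sketch. This is not the method used in the paper, only a hedged toy example: it assumes numeric quasi-identifiers, a snooper who links each external identified record to its nearest released record by Euclidean distance, and a known ground truth for scoring. All names (`link_records`, `reidentification_rate`) and all data values are hypothetical.

```python
from math import sqrt

def link_records(external, released, qi_keys):
    """Link each identified external record to the index of the closest
    released record, comparing only the quasi-identifier attributes."""
    links = {}
    for ext in external:
        def distance(i):
            # Euclidean distance on the quasi-identifiers (an assumption;
            # a real attack might standardize attributes first).
            return sqrt(sum((ext[k] - released[i][k]) ** 2 for k in qi_keys))
        links[ext["name"]] = min(range(len(released)), key=distance)
    return links

def reidentification_rate(links, truth):
    """Fraction of external records linked to the released record that
    was actually generated from the same individual."""
    correct = sum(1 for name, idx in links.items() if truth[name] == idx)
    return correct / len(links)

# Hypothetical external source: identifiers plus quasi-identifiers.
external = [
    {"name": "Alice", "age": 34, "height_cm": 165},
    {"name": "Bob", "age": 51, "height_cm": 180},
    {"name": "Carol", "age": 29, "height_cm": 172},
]

# Hypothetical released file: synthetic quasi-identifiers that stay close
# to the originals, plus the unmodified confidential attribute.
released = [
    {"age": 50, "height_cm": 181, "diagnosis": "B"},  # derived from Bob
    {"age": 35, "height_cm": 165, "diagnosis": "A"},  # derived from Alice
    {"age": 30, "height_cm": 171, "diagnosis": "C"},  # derived from Carol
]

# Ground truth known only to the data protector, used to score the attack.
truth = {"Alice": 1, "Bob": 0, "Carol": 2}

links = link_records(external, released, ["age", "height_cm"])
rate = reidentification_rate(links, truth)
print(links, rate)
```

In this toy setting every link is correct because the synthetic quasi-identifiers stay close to the originals, so the snooper recovers each (identifier, confidential attribute) pair. A generator that distorts the quasi-identifiers more strongly would be expected to lower the rate, at some cost in utility.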