Over the years, various attempts have been made to anonymize databases so Personally Identifiable Information (PII) is unavailable to those who use the databases for research or business. To date, those anonymizing techniques have met with limited success.
In a 2009 TechRepublic column, "Electronic databases: What's new with privacy concerns," Paul Ohm, associate professor of law at the University of Colorado Law School, gets right to the problem, saying, "Data can either be useful or perfectly anonymous but never both."
Ohm is not alone; several additional TechRepublic columns provide comments from experts who believe anonymized data is anything but anonymous.
- Is metadata collected by the government a threat to your privacy?
- Google changes data retention policy for Google search
- Big data will enhance healthcare, but to whose benefit?
The hope is that, since today's big data and analysis technologies are powerful enough to solve critical global-scale problems, data scientists and technologists will discover a method for real anonymization. Until PII can be kept truly anonymous, however, what happens to all the people whose sensitive personal information is compromised in a data breach?
SEE: The 18 most frightening data breaches (TechRepublic)
A new anonymizing approach
Researchers at Radboud University in the Netherlands may have found a solution for managing sensitive personal data. In their paper, Polymorphic Encryption and Pseudonymisation for Personalised Healthcare (PDF), Eric Verheul, Bart Jacobs, Carlo Meijer, Mireille Hildebrandt, and Joeri de Ruiter explain one way of ensuring the confidentiality, integrity, and authenticity of data, as well as its availability.
SEE: Encryption Policy (Tech Pro Research)
As the title of their paper suggests, the researchers focus on healthcare databases. This is definitely a good thing: EHR databases are now popular targets for cybercriminals because of the amount of data available in one location, and because health data, unlike financial information, cannot be changed or canceled.
The paper’s authors explain that the PEP framework consists of two components: polymorphic encryption and polymorphic pseudonymisation. The researchers begin with polymorphic encryption by explaining how it differs from more traditional encryption processes: “In traditional encryption, one encrypts for some chosen recipient who then holds the decryption key; whereas in polymorphic encryption one encrypts in a general manner and at a later time the encryption can be transcribed to multiple recipients with different keys.”
The authors then outline the polymorphic encryption process:
- Directly after generation, data can be encrypted in a “polymorphic” manner and stored at a (cloud) storage facility in such a way that the storage provider cannot get access. Crucially, there is no need to a priori fix who gets to see the data, so that the data can immediately be protected. For instance, a PEP-enabled self-measurement device will store all its measurement data in polymorphically encrypted form in a back-end database.
- Later it can be decided who can decrypt the data. This decision will be made on the basis of a policy in which the data subject should play a key role. The user of the PEP-enabled device can, for instance, decide that doctors X, Y, and Z may at some stage decrypt and use the data in their diagnosis; or medical researcher groups A, B, and C may use it for their investigations; or third parties U, V, and W may use it for additional services.
- This "tweaking" of the encrypted data to make it decryptable by a specific party can be done in a blind manner. It will have to be done by a trusted party who knows how to tweak the ciphertext for those to be granted access.
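The blind "tweaking" step above can be illustrated with a toy ElGamal-style rekeying construction, which is the kind of primitive the paper's polymorphic encryption builds on. Everything in this sketch is illustrative rather than taken from the paper: the group parameters are far too small for real security, and the key names are hypothetical.

```python
# Toy sketch of ElGamal-style "rekeying" (blind ciphertext tweaking).
# NOT secure: the group is tiny and nonces are fixed for reproducibility.

# Small safe-prime group: p = 2q + 1, with g generating the order-q subgroup.
p, q, g = 23, 11, 4

master_x = 3                       # master private key
master_y = pow(g, master_x, p)     # master public key

def encrypt(m, y, r):
    """ElGamal encryption of message m under public key y with nonce r."""
    return (pow(g, r, p), (m * pow(y, r, p)) % p)

def decrypt(c, x):
    """Recover m = c2 / c1^x."""
    c1, c2 = c
    return (c2 * pow(pow(c1, x, p), -1, p)) % p

def rekey(c, k):
    """Blindly tweak a ciphertext so it decrypts under key k * master_x.
    The trusted party needs only the rekey factor k -- never the
    plaintext and never any private key."""
    c1, c2 = c
    k_inv = pow(k, -1, q)          # inverse modulo the group order
    return (pow(c1, k_inv, p), c2)

# A PEP-enabled device encrypts once, "polymorphically", under the master key.
m = 9
c = encrypt(m, master_y, r=5)

# Later, the trusted party grants a hypothetical doctor X (factor k_X) access:
k_X = 4
doctor_x = (k_X * master_x) % q    # doctor X's private key
c_for_X = rekey(c, k_X)
assert decrypt(c_for_X, doctor_x) == m
```

The point of the design is that the encrypting device never has to know who will eventually read the data: the rekey operation works on the ciphertext alone, so the trusted party that applies it learns nothing about the plaintext.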
Something that's unique to the PEP framework: Users remain in control of their data; they can monitor where the data is being used, who is using it, and for what purpose. As to why that is important, in this Radboud University press release, Bart Jacobs, professor of digital security at Radboud University, notes, "In the context of international medical research, personal information is worth its weight in gold. So it's important for the government to invest in an infrastructure that guarantees the protection of this information. Especially to ensure that people will remain willing to participate in future studies of this sort."
If cybercriminals steal a database protected by a PEP framework, it appears they will have wasted their time. The researchers offer more good news: The PEP approach can be applied to other applications, such as handling sensor or surveillance data from a swarm of IoT devices.