

Information in non-production databases

By rparthas
Most companies that own a database make copies (clones) of it for development, QA, testing, patch testing, etc. These copies are distributed to programmers, contractors, sub-contractors, outside vendors and, at times, offshore vendors.

The cloned copy of the database contains all the information in the 'production' system -- depending on the nature of the database, it can contain sensitive personal information (salary, SSN, address, age, etc.) as well as corporate information (pricing information, supplier, financial data, etc.).

With the extraordinary levels of legislation and standards around privacy, this area should be getting a lot of attention.

How do organizations protect the information in these databases?





by JamesRL In reply to Information in non-produc ...

This kind of data should never be shared. I can tell you that if and when customers find out, it's a potential public relations nightmare, and it will have a negative impact on the bottom line.

I have heard of tools that will change the data so that it no longer has the personal information in it. Better yet, there are tools for creating test data - that's what should be used.



Structure and Masked Data only?

by Paymeister In reply to Information in non-produc ...

Good question - the Sarbanes-Oxley Act certainly has my attention, and this is an area I hadn't thought much about.

One can understand storing the corporate info or salaries, and SSNs for tax computation... but SSNs should certainly not be used for ID purposes (for legal and data integrity reasons). If records are identified by "employee ID numbers" instead, one doesn't have to worry about the confidentiality of the identifier.

If the development tricks work with ten records, they'll work with the full set: we delete all but a dozen records, and substitute bogus data for the real stuff to make a "testing" version. This would be rough for a database with lots of interconnections, but may be worth considering for less complex data.
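A minimal sketch of that "keep a dozen, swap in bogus data" idea, in Python. The field names (`emp_id`, `name`, `ssn`, `salary`) and the `make_test_rows` helper are made up for illustration; a real schema will differ, and this only handles a single flat table.

```python
import random

def make_test_rows(rows, keep=12, seed=0):
    """Keep a small random sample and overwrite sensitive fields with bogus values."""
    rng = random.Random(seed)  # fixed seed so the test set is reproducible
    sample = rng.sample(rows, min(keep, len(rows)))
    test_rows = []
    for i, row in enumerate(sample, start=1):
        masked = dict(row)  # copy, so the source rows are left untouched
        masked["name"] = f"Test Employee {i}"
        masked["ssn"] = f"000-00-{i:04d}"  # area number 000 is never issued
        masked["salary"] = rng.randrange(30000, 120000, 1000)
        test_rows.append(masked)
    return test_rows

# Hypothetical stand-in for rows read from the production table.
production = [
    {"emp_id": n, "name": f"Real Person {n}", "ssn": f"123-45-{n:04d}", "salary": 50000 + n}
    for n in range(1, 101)
]

test_rows = make_test_rows(production)
```

The non-sensitive key (`emp_id`) is left alone so that anything joined on it still lines up; only the confidential columns are replaced.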


Good idea

by rparthas In reply to Structure and Masked Data ...

Paymeister -- You are on the right track, and your idea is very good. It will need to be extended into a bigger approach when you have a complicated relational database where the information is stored in a number of places and referenced throughout. Also, when such a method is developed internally, what happens to independence -- the person who built the method has too much information AND the mechanism to reverse engineer it.



by Snow rabbit In reply to Information in non-produc ...

Based on my experience with commercial and government organizations:

Depending on your objectives you have a few alternatives:

(1) You can distribute databases with specially designed (minimal) test data.
Though theoretically correct, this may not be your most practical choice

so the next alternative is

(2) make a copy of an existing database and make *SURE* you scramble the sensitive data in such a way that it is impossible to reverse the scrambling process while making sure that the result is still usable. Make *SURE* that you distribute the right database version.
This choice is a good one when you need several reference tables and/or a lot of data - e.g. for performance or stress tests.
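One way option (2) could look, as a rough Python sketch rather than a recommended product: scramble each SSN with a keyed hash (HMAC-SHA256) so the same input always maps to the same SSN-shaped output. That keeps rows in different tables joinable after scrambling. The `scramble_ssn` helper, the table contents and the column names are all assumptions for illustration.

```python
import hashlib
import hmac
import secrets

def scramble_ssn(ssn: str, key: bytes) -> str:
    """Map an SSN to a stable, SSN-shaped pseudonym using a keyed hash."""
    digest = hmac.new(key, ssn.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(digest[:8], "big") % 10**9  # reduce to nine digits
    digits = f"{number:09d}"
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"

# Random key generated for this one scrambling run and never stored.
key = secrets.token_bytes(32)

# Hypothetical tables that reference each other by SSN.
employees = [{"ssn": "123-45-6789", "name": "Alice"}, {"ssn": "987-65-4321", "name": "Bob"}]
payroll = [{"ssn": "123-45-6789", "amount": 4200}, {"ssn": "987-65-4321", "amount": 3900}]

masked_employees = [{**row, "ssn": scramble_ssn(row["ssn"], key)} for row in employees]
masked_payroll = [{**row, "ssn": scramble_ssn(row["ssn"], key)} for row in payroll]
```

Two caveats. There are only a billion possible SSNs, so anyone holding the key could rebuild the mapping by trying them all; the scrambling is only hard to reverse if the key is destroyed after the run. And because the output is cut down to nine digits, two different inputs can occasionally map to the same value, so a large table should be checked for collisions. Only the `ssn` column is handled here; names and other fields would need their own treatment.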

by saghir_taj In reply to Information in non-produc ...

DBNest: The Nest of DB Professionals is a nice resource and a great addition to database websites.

