Four best practices for securing big data

Presenting sound security as basic stewardship of high-value company assets is a reasoned approach that everyone can get behind.

Big data promises speed, breadth, and access to a wide variety of internal and external data sources, qualities that are often anathema to information security policies and practices. Furthermore, with big data analysis still in its infancy, many companies entrust their data stores to a variety of third parties, from technical specialists to data scientists. Here are a few suggestions for keeping your big data initiatives secure.

Data equals dollars

As you write large checks for the various software, hardware, and services associated with big data initiatives, it's easy to think of the data itself as one of the least valuable aspects of the process. After all, you already "own" it. Compared with fancy database applications running on expensive hardware and used by data scientists commanding six-figure salaries, data are merely bits and bytes that are easily transferred. However, the data are usually the most valuable aspect of a big data project. They may contain proprietary insights into your customer base or an intimate look at the financial health of your company.

You probably wouldn't leave that expensive new disk enclosure powering your analytics server sitting unattended outside your building, so make sure you're not doing the same with your data. Craft policies and procedures that protect this valuable asset.

Vet your vendors

Even with the best data protections in place, your efforts are for naught if your vendors haphazardly exchange sensitive data with other third parties or otherwise mismanage this expensive asset. Rather than trusting verbal assurances and handshakes, insist that vendors agree in writing to specific standards, with associated penalties for violating them. Ask to see things like training manuals and data security compliance statistics before handing over sensitive data. If you're in a rush or unable to vet a vendor appropriately, create sample data that contain the same fields as the "real" dataset and a realistic variety of values, but no actual records or identifiers. The sample can be used to create and vet the analytical model, which can then be run against the true data in more controlled circumstances.
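The sample-data approach can be sketched in a few lines of Python. This is a minimal illustration, not a complete anonymization method: the field names (customer_id, region, annual_spend), the value ranges, and the "SYN-" prefix are all assumptions invented for the example, and a real sample would mirror the fields of your own dataset.

```python
import random
import string

# Hypothetical field values; replace with the fields of the real dataset.
REGIONS = ["north", "south", "east", "west"]


def synthetic_row(rng):
    """Build one record with realistic-looking but entirely made-up values."""
    return {
        # The "SYN-" prefix marks the identifier as synthetic so it cannot
        # be mistaken for a real customer ID.
        "customer_id": "SYN-" + "".join(rng.choices(string.digits, k=8)),
        "region": rng.choice(REGIONS),
        "annual_spend": round(rng.uniform(100.0, 50000.0), 2),
    }


def synthetic_dataset(n_rows, seed=0):
    """Generate n_rows synthetic records; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [synthetic_row(rng) for _ in range(n_rows)]


sample = synthetic_dataset(5)
```

Because every value comes from a random generator rather than from the real records, the sample can be handed to an unvetted vendor without exposing actual customers. It will not preserve the statistical relationships in the true data, so the model still needs to be validated against the real dataset later.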


Outcomes are just as important

Just as important as the raw data that feed your analytics engine are the outcomes. At its best, big data provides actionable decisions or information that no other company has. You might identify a market or environmental factor that can give you an edge on your competition or catch a glimmer of future market conditions. Ensure that the outcomes of your big data initiative are protected and tracked just as carefully as the raw data that created them.

Speed kills

With the power of big data becoming evident, the processes and tools surrounding it have been built for speed rather than security. For a fledgling big data initiative, it might be tempting to fire up the latest open source analytic engines, load up a few tables of financial or customer data, and see what comes out before worrying about "trivialities" like data security or even basic access management.

Put some basic policies and procedures in place around these new tools, and think through who will have access to which data and what the financial impact will be if those data are broadly disclosed.
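One lightweight way to record who may see which data, and what a disclosure might cost, is a small policy table that is checked before any dataset is loaded. The sketch below is illustrative only: the dataset names, roles, and dollar figures are hypothetical, and a production system would rely on the access controls of the underlying platform rather than an in-memory dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetPolicy:
    allowed_roles: frozenset       # roles permitted to read the dataset
    disclosure_impact_usd: int     # rough estimate of the cost of a leak


# Hypothetical datasets, roles, and impact estimates.
POLICIES = {
    "customer_transactions": DatasetPolicy(frozenset({"data_scientist", "finance"}), 2_000_000),
    "web_clickstream": DatasetPolicy(frozenset({"data_scientist", "marketing"}), 50_000),
}


def can_access(role, dataset):
    """Deny by default: unknown datasets and unlisted roles get no access."""
    policy = POLICIES.get(dataset)
    return policy is not None and role in policy.allowed_roles


def highest_risk_datasets():
    """List dataset names from most to least costly if disclosed."""
    return sorted(POLICIES, key=lambda name: POLICIES[name].disclosure_impact_usd, reverse=True)
```

Writing the impact estimate next to the access list keeps the "data equals dollars" argument visible whenever someone asks for broader access.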

Bottom line

With big data security, there's no need to spread fear and panic, but presenting sound security as basic stewardship of high-value company assets is a reasoned approach that everyone can get behind, even among cries that big data should have been deployed yesterday.
