With Big Data in the crosshairs of IT projects, there’s now an understanding that burgeoning stockpiles of structured and semi-structured data need to be sorted and cleaned before they’re ready to be run through business analytics or high-performance computing (HPC).

Some of this work can be addressed with data deduplication and cleansing tools, but it’s equally important for IT to revisit its information governance policies early in the game to ensure that all of this effort is on track.

For instance, the insurance industry was mandated by the National Association of Insurance Commissioners (NAIC) to adopt the Model Audit Rule (M.A.R.) beginning in 2010. The rule was a response to the many corporate scandals that had involved insurance, and it was intended to improve transparency and to tighten internal controls. Predictably, the new mandate also impacted data policies, including policies for big data.

Getting Big Data ready to meet a new regulatory requirement is not something that tools and automation alone can solve. Instead, key players from both IT and the business side of the organization have to get together to determine which big data should be stored and collected, and how the individual pieces of this data should be defined.

Potentially, IT can then go away and assess (with the help of tools) how complete this data is, and whether the data is of high quality. However, even if IT does this assessment on its own, the ultimate sign-off must come from the end users in the business.
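As a rough illustration of the kind of tool-assisted completeness check described above, here is a minimal Python sketch. It assumes records arrive as dictionaries and that IT and the business have already agreed on a list of required fields; the function name `completeness_by_field` and the field names in the sample data (`policy_id`, `effective_date`, `premium`) are hypothetical and are not drawn from any particular product or regulation.

```python
from typing import Dict, Iterable, List, Mapping


def completeness_by_field(
    records: Iterable[Mapping[str, object]],
    required_fields: List[str],
) -> Dict[str, float]:
    """Return the fraction of records with a non-empty value for each required field.

    A value counts as missing when the key is absent, the value is None,
    or the value is an empty string.
    """
    rows = list(records)
    total = len(rows)
    report: Dict[str, float] = {}
    for field in required_fields:
        filled = 0
        for row in rows:
            value = row.get(field)
            if value is not None and value != "":
                filled += 1
        # With no records at all, report 0.0 rather than dividing by zero.
        report[field] = filled / total if total else 0.0
    return report


# Hypothetical sample data, for illustration only.
sample_records = [
    {"policy_id": "P-1", "effective_date": "2010-01-01", "premium": 1200},
    {"policy_id": "P-2", "effective_date": "", "premium": None},
    {"policy_id": "P-3", "effective_date": "2010-03-15", "premium": 950},
    {"policy_id": "P-4", "effective_date": "2010-06-30"},
]

report = completeness_by_field(
    sample_records, ["policy_id", "effective_date", "premium"]
)
for name, ratio in report.items():
    print(f"{name}: {ratio:.0%} complete")
```

A report like this only tells IT where values are missing. Whether a given level of completeness is acceptable remains a decision for the business users who sign off on the data.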

Once it is determined which big data should be collected and stored, and the data is cleaned and prepared, there must be ongoing processes in the form of “living document” policies, standards and procedures that govern all big data assets.

Key players throughout the enterprise, from C-level executives to business unit managers to IT, must be in agreement with these policies. Certainly, industry regulators will expect no less, and can be expected to interview all of these players, in addition to checking published policies and procedures, to verify that these are in compliance.

Finally, and significantly for IT, the big data that is identified for collection and storage must align closely with the information requirements of the end business. For this to happen, the CIO and others in IT must actively team with their counterparts in business units throughout the enterprise to ensure that everyone embraces the same set of big data policies and procedures, and agrees on the types of big data that are being stored and collected, along with their end business purposes.

All of this is tough, people-intensive work that doesn’t always show up in project timelines, but should.

What can IT do to ensure that its governance practices are up to speed for big data?

  • Coordinate with regulators and auditors in advance to ensure that your governance practices for big data are up to date, and to get a sense of coming changes in regulations that could impact data governance.
  • Make big data projects interdisciplinary, because effective governance for big data is everybody’s responsibility.
  • Ensure that IT has governance know-how on board, both by developing IT staff who are comfortable working with their counterparts on the business side of governance, and by acquiring appropriate toolsets for big data that can prepare the data in ways that meet governance standards.