In February 2012, industry research firm Gartner issued a report predicting several imminent changes in IT infrastructure with the coming of big data.
Two important observations Gartner made at the time were:
- Big data was not simply big volume; it also involved variety, velocity and complexity.
- There would be significant changes to IT infrastructure and to how IT allocated its budget.
Prognostications like these are always food for thought for CIOs. However, now that IT has had its feet on the ground with big data for another year, experience gained in the trenches can further sharpen thinking about what is needed to develop a mature big data operation. Here is what we are seeing:
Higher expectations in business analytics that use big data will drive organizations to high performance computing (HPC).
Business analytics didn’t start out with the assumption that HPC would be needed. IT (and vendors) thought that analytics could be executed with specialized servers working with relational databases. But as more companies pressed to integrate unstructured and semi-structured big data from outside the enterprise (e.g., the Internet), and as they wanted to query the data with increasingly complex questions, some hit a wall with first-tier analytics. Even SMBs (small and medium-sized businesses) are feeling the pinch. Some are now looking for HPC that might be delivered via affordable cloud-based solutions. Why? To get what they want from their big data. Enterprises with greater IT resource pools are beginning to opt for data center-resident x86-based cluster computers that are specially constructed for HPC. All of this is aimed at crunching through their big data.
Now that big data is here, a body of knowledge must be developed so IT knows how to manage it.
IT is not the primary user of big data in most of the businesses beginning to exploit it, but companies have very quickly determined that IT is the best department to manage both big data and HPC. One area where C-level executives expect fast results from IT is in knowing how to get the most out of the company’s information assets. It’s great to be able to plumb the depths of big data, and to use advanced processing like HPC that can handle complex queries, but a new set of performance metrics also needs to be defined and managed to. To develop and manage to these metrics, IT (and end users) have to understand the types of big data they want to analyze, and how often (or how quickly) this data must be accessed.
IT’s greatest role with big data and HPC might be as custodian and mediator.
Organizations are going to achieve success with big data analytics and with technologies like HPC. When this happens, more departments will want to utilize these assets. One key to success with HPC and big data is knowing how to schedule different jobs that run concurrently. This means that someone (probably IT!) will need to sit down with all of the big data and HPC users, set up run schedules, and possibly mediate priorities. This requires diplomacy, negotiation and people skills. It also depends on someone (probably IT!) sitting down with senior executives and line managers to establish and agree on what the corporate information priorities are.
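To make the scheduling problem concrete, here is a minimal sketch of how agreed priorities might translate into a run order for competing jobs. It is only an illustration: the `JobRequest` and `build_run_schedule` names, the departments, the job names and the convention that a lower number means a higher priority are all assumptions, not features of any particular HPC scheduler.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class JobRequest:
    # Lower number = higher corporate priority (an assumed convention).
    priority: int
    # Submission order breaks ties, so equal-priority jobs run first come, first served.
    submitted: int
    name: str = field(compare=False)
    department: str = field(compare=False)


def build_run_schedule(requests):
    """Return the job requests in the order they should run."""
    heap = list(requests)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


# Hypothetical requests from three departments sharing one HPC resource.
requests = [
    JobRequest(priority=2, submitted=1, name="customer-sentiment", department="Marketing"),
    JobRequest(priority=1, submitted=2, name="fraud-scoring", department="Finance"),
    JobRequest(priority=2, submitted=3, name="demand-forecast", department="Operations"),
]

for slot, job in enumerate(build_run_schedule(requests), start=1):
    print(slot, job.department, job.name)
```

The code is the easy part. The priority numbers themselves still have to come out of the conversations with users, line managers and senior executives.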
IT infrastructure and budgeting will change.
Big data and business analytics are still in early evolutionary stages. As time passes, more businesses will expect these analytics to operate in real time, right alongside transaction processing. Standard rules of thumb that today prioritize transaction processing and stick all of the report generation into a background or nightly batch process won’t work anymore. This is going to require a total “rethink” of IT infrastructure, operating procedures, and even where IT dollars are spent.