Before you jump on the Big Data bandwagon, make sure you understand exactly what you're getting into.
Big Data and business analytics are two of the most exciting areas in business and IT these days -- but for most enterprises, they are still developmental. Although the opportunities are boundless, the road to an effective Big Data operation is fraught with challenges. Here are some of the obstacles companies are encountering -- and some ways to get around them.
1: Budget
Traditional servers in enterprise data centers are not designed for processing Big Data. At a minimum, analytics servers, and in some cases high performance computing (HPC) servers and applications, will be needed. This will require new IT investment. The key to success here for the CIO is to build a business case in plain English so that others in the organization (like the CFO) can understand why servers already installed in the data center can't be repurposed to work with Big Data. The CIO should have this understanding (and buy-in) in place before making any IT investment.
2: IT know-how
Big Data doesn't process the way online transactional data does -- and it requires a different strategy for both storage and processing. Big Data processors run several processing threads in parallel as they work the data, rather than proceeding sequentially as servers do when they process online transactions.
Storage strategy must also change. This starts with a tiering of storage that places the most sought-after data on faster storage devices, such as cache/solid state disk, and less frequently accessed data on slower hard disks. There are turnkey automated storage tiering solutions on the market. But ultimately, many IT departments want to formulate their own rules for how Big Data is prioritized and accessed. This requires a level of strategic expertise from storage professionals that IT departments haven't demanded before. CIOs can prepare their storage staff for a heightened role by ensuring that they are included in IT strategic planning meetings -- and that they have the latest in storage management training.
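As a rough illustration of what "formulating your own rules" for tiering can look like, here is a minimal sketch in Python. The `DataSet` class, the `assign_tier` function, and the access threshold are all hypothetical and chosen only for illustration; real rules would come from your storage team and your storage platform's own tooling.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    reads_per_day: int  # how often the data is accessed


# Hypothetical threshold: data read at least this often counts as "hot".
HOT_READS_PER_DAY = 1000


def assign_tier(ds: DataSet) -> str:
    """Place frequently accessed data on fast storage, the rest on slow disk."""
    if ds.reads_per_day >= HOT_READS_PER_DAY:
        return "ssd"
    return "hard_disk"


# Example usage with made-up data sets.
for ds in (DataSet("clickstream", 5000), DataSet("old_logs", 3)):
    print(ds.name, "->", assign_tier(ds))
```

A single threshold is the simplest possible rule. In practice, a department might also weigh factors such as data age or business priority.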
3: Business know-how
Business analytics and Big Data vendors are eager to knock on your door with turnkey reports and easy ways to get started with Big Data -- but all too often, end business users simply ask that the top 10 to 20 reports they've been using for the past 15 years get converted to the new solution first. This isn't a good way to use Big Data -- or to help the company get closer to answering tough business questions that have eluded it in the past. Knowing how to query Big Data to answer the big questions is another area where current business skills fall short. One way to grow these skills is to contract with the vendor (which usually has Big Data trainers and specialists on staff) to provide Big Data/business analytics training to end users as part of the solution implementation process.
4: Data cleanup
Big Data and business analytics are only as good as the data itself. This is why the first step of any Big Data project should be a cleanup that removes incomplete, inaccurate, and duplicate data. The CIO must explain this and secure top management's support for a Big Data cleanup, which will look to those on the outside like a lot of effort expended for no tangible results. The best approach to selling the process is to present the facts upfront so there are no surprises.
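To make the cleanup step concrete, the sketch below shows one simple way to drop incomplete, invalid, and duplicate records in Python. The `clean_records` function, the field names, and the validity check are assumptions made for illustration; what counts as "inaccurate" depends entirely on your own data.

```python
def clean_records(records, required_fields, is_valid=lambda rec: True):
    """Return records that are complete, pass is_valid, and are not duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete: a required field is missing or empty.
        if any(rec.get(f) in (None, "") for f in required_fields):
            continue
        # Inaccurate: fails the caller-supplied check (hypothetical rule).
        if not is_valid(rec):
            continue
        # Duplicate: same required-field values, ignoring case and padding.
        key = tuple(str(rec[f]).strip().lower() for f in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Example usage with made-up customer records.
customers = [
    {"email": "ann@example.com", "zip": "55401"},
    {"email": "ANN@example.com ", "zip": "55401"},  # duplicate
    {"email": "", "zip": "10001"},                   # incomplete
    {"email": "bob@example.com", "zip": "abc"},      # fails validity check
]
result = clean_records(customers, ["email", "zip"],
                       is_valid=lambda r: r["zip"].isdigit())
print(result)
```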
5: The storage bulge
The amount of data under management in enterprises has grown fivefold over the past four years. And while this has happened, we have gotten no better at managing data. If enterprises are going to harvest the kernels of wisdom buried in Big Data, they are first going to have to find ways to unravel it. This begins by sorting through the data, deciding what is important, and either archiving or getting rid of the rest.
6: New data center workloads
Enterprise data centers are organized around online transaction processing, which functions at priority one. Batch processing is run at night or at low priority during the day. With business analytics and Big Data, there is now a call to run real-time analytics at high priority so that retailers can analyze and respond to who is buying what at the same time the buying activity is taking place. This means that data center operations have to change so they also reflect these new priorities.
The best way to effect the transition is to get your IT staff engaged in analyzing the workloads currently run through the data center to determine how they will likely change and which areas of IT daily operations will also have to change. The sooner you begin this process, the sooner your data center will be positioned for Big Data -- and the less uncertainty your staff will experience.
7: Data retention
One of the major causes of data accumulation in organizations is a fear of permanently losing that data. E-discovery law is a good example. Your company might be subpoenaed for every email written over the last 10 years. And regulators might require that you keep your data for many years. Nevertheless, it's still IT's job to take the lead in working with different end-user departments to set data retention policy and to determine what happens to data (archiving or elimination) at the end of retention timeframes. This somewhat clerical task is often at the bottom of IT's to-do list. But for Big Data management, data retention policies (and enforcement) need to be toward the front of the list.
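A retention policy ultimately boils down to a table of data categories, retention periods, and end-of-life actions. The Python sketch below shows one way such a table might be expressed and enforced. The categories, periods, and the `retention_action` function are hypothetical; actual retention periods must come from your legal and compliance requirements.

```python
from datetime import date

# Hypothetical policy: category -> (years to retain, action when expired).
RETENTION_POLICY = {
    "email": (10, "archive"),
    "web_logs": (2, "delete"),
}


def retention_action(category: str, created: date, today: date) -> str:
    """Return 'keep' while inside the retention window, else the policy action."""
    years, action = RETENTION_POLICY[category]
    try:
        expires = created.replace(year=created.year + years)
    except ValueError:
        # created was Feb 29 and the target year is not a leap year.
        expires = created.replace(year=created.year + years, day=28)
    return action if today >= expires else "keep"


# Example usage with made-up dates.
today = date(2013, 1, 1)
print(retention_action("web_logs", date(2010, 1, 1), today))
print(retention_action("email", date(2010, 1, 1), today))
```

Keeping the policy in one table, separate from the enforcement logic, makes it easier for end-user departments to review and sign off on the rules.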
8: Vendor role clarification
Because so many organizations are inexperienced with Big Data, many vendors offer turnkey solutions complete with prefab analytics reports. These reports are great for getting started, but you'll want to start developing your own internal expertise. Make sure that your vendor understands this so it can be a strong business partner.
9: Business and IT alignment
Business goals and IT Big Data strategy should be tightly aligned before any IT investments are made. Ultimately, C-level executives are going to look back to see whether they really were able to answer the big questions and gain competitive advantage for the company through the use of Big Data. Do you want to predict when there will be a disruption to a supply chain so you can rearrange your logistics to still be on time with your order fulfillment? Is it important to know when a certain buying trend first emerges so you can be first to market? Know what you are going after before you invest in Big Data and analytics.
10: Developing new talent
Everyone is scrambling to find people who can run Big Data and business analytics. Who are these people? Some are statistical engineers who are trained in logic, mathematics, computer science, and complex problem-solving involving huge amounts of data. These data engineers and analysts are not the people currently performing analysis and programming on your IT staff. That's why many companies initiating Big Data and business analytics projects either hire consultants to train their people or look to hire data engineers. Data engineers are in high demand and they are very expensive. For CIOs, fostering an aggressive training program for internal staff members whom you believe you can develop into Big Data specialists might be the best bet.