10 roadblocks to implementing Big Data analytics

Before you jump on the Big Data bandwagon, make sure you understand exactly what you're getting into.

Big Data and business analytics are two of the most exciting areas in business and IT these days -- but for most enterprises, they are still developmental. Although the opportunities are boundless, the road to an effective Big Data operation is fraught with challenges. Here are some of the obstacles companies are encountering -- and some ways to get around them.

1: Budget

Traditional servers in enterprise data centers are not designed for processing Big Data. At a minimum, analytics servers, and in some cases high performance computing (HPC) servers and applications, will be needed. This will require new IT investment. The key to success here for the CIO is to build a business case in plain English so that others in the organization (like the CFO) can understand why servers already installed in the data center can't be repurposed to work with Big Data. The CIO should have this understanding (and buy-in) in place before making any IT investment.

2: IT know-how

Big Data doesn't process like online transactional data does -- and it requires a different strategy for both storage and processing. Big Data processors run several processing threads in parallel as they work the data, rather than proceeding sequentially as processors do when they handle online transactions.
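To make the contrast concrete, here is a minimal Python sketch of the split-and-merge pattern behind parallel processing: the data is divided into partitions, each partition is worked on concurrently, and the partial results are combined. The function names and sample data are hypothetical, chosen only for illustration. In CPython, threads show the structure of the approach rather than a real CPU speedup; production Big Data platforms spread the partitions across many processors or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Count word occurrences in one partition (a list of text lines)."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(partitions, workers=4):
    """Work on all partitions concurrently, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, partitions):
            total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical sample data, already split into three partitions.
partitions = [
    ["big data big plans"],
    ["data cleanup first"],
    ["big budget"],
]
totals = parallel_word_count(partitions)
print(totals["big"], totals["data"])  # prints: 3 2
```

A transactional system, by contrast, would typically handle each request start to finish, one after another.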

Storage strategy must also change. This starts with a tiering of storage that places the most sought-after data on faster storage devices, such as cache/solid state disk, and less frequently accessed data on slower hard disks. There are turnkey automated storage tiering solutions on the market. But ultimately, many IT departments want to formulate their own rules for how Big Data is prioritized and accessed. This requires a level of strategic expertise from storage professionals that IT departments haven't demanded before. CIOs can prepare their storage staff for a heightened role by ensuring that they are included in IT strategic planning meetings -- and that they have the latest in storage management training.
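As an illustration of what "formulating your own rules" might look like, the following Python sketch assigns datasets to a storage tier based on how often they are read. The threshold, the tier names, and the `Dataset` type are assumptions made for this example; a real policy would draw on your own access statistics and on whatever tiers your storage hardware offers.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    reads_per_day: int


# Hypothetical threshold: datasets read at least this often count as "hot".
HOT_READS_PER_DAY = 1000


def assign_tier(dataset):
    """Place frequently read data on fast storage and the rest on slow disk."""
    if dataset.reads_per_day >= HOT_READS_PER_DAY:
        return "ssd"  # cache / solid state tier
    return "hdd"      # slower hard disk tier


datasets = [Dataset("current_orders", 5000), Dataset("logs_2009", 3)]
placement = {d.name: assign_tier(d) for d in datasets}
print(placement)  # prints: {'current_orders': 'ssd', 'logs_2009': 'hdd'}
```

Keeping the rule in one small function makes it easy for storage staff to review and adjust as access patterns change.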

3: Business know-how

Business analytics and Big Data vendors are eager to knock on your door with turnkey reports and easy ways to get started with Big Data -- but all too often, end business users simply ask that the top 10 to 20 reports they've been using for the past 15 years be converted to the new solution first. This isn't a good way to use Big Data -- or to help the company answer tough business questions that have eluded it in the past. Knowing how to query Big Data to answer the big questions is another area where present business skills fall short. One way to grow these skills is to contract with the vendor (which usually has Big Data trainers and specialists on staff) to provide Big Data/business analytics training to end users as part of the solution implementation process.

4: Data cleanup

Big Data and business analytics are only as good as the data itself. This is why cleaning up data to ensure that incomplete, inaccurate, and duplicate data is removed should be the first step of any Big Data project. The CIO must explain this and secure top management's support for a Big Data cleanup, which will look to those on the outside like a lot of effort expended for no tangible results. The best approach to selling the process is to present the facts upfront so there are no surprises.
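A cleanup pass of this kind can be sketched in a few lines of Python. The example below drops records that are incomplete, fail a simple accuracy check, or duplicate an earlier record. The field names and the validation rules are hypothetical; real rules depend on your own data and on what the business considers "accurate."

```python
# Hypothetical required fields for a customer purchase record.
REQUIRED_FIELDS = ("customer_id", "email", "amount")


def clean_records(records):
    """Return records that are complete, plausible, and not duplicated."""
    seen = set()
    cleaned = []
    for record in records:
        # Incomplete: a required field is missing or empty.
        if any(record.get(f) in (None, "") for f in REQUIRED_FIELDS):
            continue
        # Inaccurate (illustrative rules only): negative amount or malformed email.
        if record["amount"] < 0 or "@" not in record["email"]:
            continue
        # Duplicate: same customer, email (case-insensitive), and amount seen before.
        key = (record["customer_id"], record["email"].lower(), record["amount"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


raw = [
    {"customer_id": 1, "email": "ann@example.com", "amount": 20.0},
    {"customer_id": 1, "email": "Ann@example.com", "amount": 20.0},  # duplicate
    {"customer_id": 2, "email": "", "amount": 15.0},                 # incomplete
    {"customer_id": 3, "email": "bob-at-example", "amount": 9.5},    # inaccurate
]
cleaned = clean_records(raw)
print(len(raw), len(cleaned))  # prints: 4 1
```

Reporting the before and after counts, as the last line does, is one simple way to show management what the cleanup actually accomplished.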

5: The storage bulge

The amount of data under management in enterprises has grown fivefold over the past four years. And while this has happened, we have gotten no better at managing data. If enterprises are going to harvest the kernels of wisdom buried in Big Data, they are first going to have to find ways to untangle it. This begins by sorting through the data, deciding what is important, and either archiving or getting rid of the rest.

6: New data center workloads

Enterprise data centers are organized around online transaction processing, which functions at priority one. Batch processing is run at night or at low priority during the day. With business analytics and Big Data, there is now a call to run real-time analytics at high priority so that retailers can analyze and respond to who is buying what at the same time the buying activity is taking place. This means that data center operations have to change so they also reflect these new priorities.

The best way to effect the transition is to get your IT staff engaged in analyzing the workloads currently run through the data center to determine how they will likely change and which areas of daily IT operations will also have to change. The sooner you begin this process, the sooner your data center will be positioned for Big Data -- and the less uncertainty your staff will experience.
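The shift in priorities described above can be pictured as a change to the job queue. This Python sketch uses a priority queue in which real-time analytics is promoted to the same priority as online transaction processing, while batch work stays low. The workload names and priority numbers are assumptions for illustration, not a description of any particular scheduler.

```python
import heapq
import itertools

# Hypothetical priorities: lower number runs first.
# Real-time analytics now shares top priority with transaction processing.
PRIORITY = {"oltp": 1, "realtime_analytics": 1, "batch": 3}


class WorkloadQueue:
    """Hand out jobs by priority, first come first served within a priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, kind, job):
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._order), kind, job))

    def next_job(self):
        _, _, kind, job = heapq.heappop(self._heap)
        return kind, job


queue = WorkloadQueue()
queue.submit("batch", "nightly sales report")
queue.submit("oltp", "customer order")
queue.submit("realtime_analytics", "who is buying what right now")

run_order = [queue.next_job()[0] for _ in range(3)]
print(run_order)  # prints: ['oltp', 'realtime_analytics', 'batch']
```

Under the older arrangement, the analytics job would have carried a batch-level priority and run last.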

7: Data retention

One of the major causes of data accumulation in organizations is a fear of permanently losing that data. E-discovery law is a good example. Your company might be subpoenaed for every email written over the last 10 years. And regulators might require that you keep your data for many years. Nevertheless, it's still IT's job to assume a lead role in working with different end-user departments to set data retention policy and to determine what happens to data (archiving or elimination) at the end of retention timeframes. This somewhat clerical task is often at the bottom of IT's to-do list. But for Big Data management, data retention policies (and enforcement) need to be toward the front of the list.
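Once a retention policy is agreed on, enforcing it can be largely mechanical. The Python sketch below looks up a retention period and an end-of-life action for each category of data and reports what should happen to an item of a given age. The categories, periods, and actions are hypothetical placeholders; the actual values must come from your legal, regulatory, and end-user requirements.

```python
from datetime import date, timedelta

# Hypothetical policy: category -> (retention period in days, action at expiry).
RETENTION_POLICY = {
    "email": (10 * 365, "archive"),
    "web_logs": (90, "delete"),
}


def retention_action(category, created, today):
    """Return 'keep' while inside the retention period, else the policy's action."""
    days, action = RETENTION_POLICY[category]
    if today - created > timedelta(days=days):
        return action
    return "keep"


today = date(2024, 1, 1)
print(retention_action("web_logs", date(2023, 1, 1), today))  # prints: delete
print(retention_action("email", date(2020, 1, 1), today))     # prints: keep
```

Keeping the policy in a single table makes it easier for end-user departments to review it and for IT to enforce it consistently.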

8: Vendor role clarification

Because so many organizations are inexperienced with Big Data, many vendors offer turnkey solutions complete with prefab analytics reports. These reports are great for getting started, but you'll want to start developing your own internal expertise. Make sure that your vendor understands this so it can be a strong business partner.

9: Business and IT alignment

Business goals and IT Big Data strategy should be tightly aligned before any IT investments are made. Ultimately, C-level executives are going to look back to see whether they really were able to answer the big questions and gain competitive advantage for the company through the use of Big Data. Do you want to predict when there will be a disruption to a supply chain so you can rearrange your logistics to still be on time with your order fulfillment? Is it important to know when a certain buying trend first emerges so you can be first to market? Know what you are going after before you invest in Big Data and analytics.

10: Developing new talent

Everyone is scrambling to find people who can run Big Data and business analytics. Who are these people? Some are statistical engineers who are trained in logic, mathematics, computer science, and complex problem-solving involving huge amounts of data. These data engineers and analysts are not the people currently performing analysis and programming on your IT staff. That's why many companies initiating Big Data and business analytics projects either hire consultants to train their people or look to hire data engineers. Data engineers are in high demand and they are very expensive. For CIOs, fostering an aggressive training program for internal staff members whom you believe you can develop into Big Data specialists might be the best bet.

About

Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President o...

2 comments
JohnM_Haddad

One of the places companies make a mistake with Big Data is they assume it’s only about “new” data. It’s not. It’s also about the volume growth of traditional transactional data, which, according to one survey, is growing 50-60% a year. This is making transactional applications and analytics infrastructure fall down and is requiring significant hardware upgrades to keep up with the pace. It’s also about trying to take advantage of the variety of new data, which is growing 3-4x a year. Those companies that realize this are working through these roadblocks. With regard to budget, they’re changing their architecture to avoid the expensive hardware upgrades. They’re offloading processing of source systems and data warehouses by moving to real-time data integration technologies running on commodity hardware (see the webinar recording “Tackling Big Data Using Informatica PowerCenter Grid” at http://vip.informatica.com/cathertoninformaticacom7562?elqPURLPage=10297). They’re also addressing the challenges associated with IT know-how and the storage bulge by analyzing data more carefully, archiving what they’re not using, and using a tiered storage approach. This is freeing up a lot of cash that can be used for new investments related to Big Data projects. All these savings help them invest in handling the variety of data and in innovations leading to new revenue-generating data products and services. Data integration and data cleanup are often 80% of the work involved with Big Data analytics, so organizations are investing in no-code visual development environments to build these data flows, which also enables them to use more readily available resources like ETL developers.

mjc5

In a day when there is plenty of processing power, plenty of storage, and plenty of people who know how to make it all work, we are returning to an old, failed mode of operation. I remember, probably around 1980, my spouse's boss showing us his computer system. He was going to sell time on it to other people. It didn't work out well. The only reason Big Data and the cloud exist is to eliminate a lot of IT jobs.