Defining an ROI for Big Data

Efficient resource utilization is the area of Big Data ROI that the CIO needs to bring to the budget discussion.

CIOs know that no matter how great the promise of a new technology, it has to be "sold" at the budget table before it can be implemented. The selling process is complex. It begins with stating the cost of hardware and software in dollars and cents, and then adds any staff resources or special skills that must be brought onboard. From there, the discussion turns to the ultimate benefits of the technology for the business, whether those are faster time to market, greater revenue opportunities or cost savings.

This is the backdrop for IT decision-makers when they read reports from top-level research companies like McKinsey, which state that big data "will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus." McKinsey projects that a retailer using Big Data could increase its operating margin by more than 60 percent; that U.S. healthcare could use big data analysis to drive efficiencies and quality worth $300 billion in value every year, reducing expenditure by about eight percent; and that government administrators in developed European economies could save more than $149 billion in operational efficiencies by using Big Data.

But will this be enough for CIOs and business leaders to bring big data technology through the door?

The CFO will likely be the first person to ask for a projected return on investment. The irony is that the CIO may not have any existing IT ROI models to draw from. Traditional IT ROI models are based on elements like speed per transaction, shrinking data center equipment footprints and energy savings. Big Data doesn't work on a speed-per-transaction basis, and it does not run on virtualized machines, which are pivotal to shrinking data center footprints and gaining energy savings. To compound the situation, Big Data applications like automobile performance simulations or modeling new drug formulations can sometimes take hours to run. These apps usually can't qualify for the more compact and inexpensive processing options of simple business analytics computing.

This begins to conjure up multi-million-dollar images of supercomputers, certain to make every CFO's and CIO's hair turn gray.

Fortunately, there is a growing number of scalable, clustered high performance computing solutions that make Big Data analysis a reasonable option for enterprises. These solutions can "start small" and then be expanded as enterprise big data analysis needs grow. They can also be scaled to the level of supercomputers if that is ever needed. This can be enough to pass the CFO's litmus test on cost of acquisition.

Of course, there is still more to do.

The team proposing big data technology must also show how the technology is going to bring value to the enterprise, and how long it will take the company to recoup its technology investment.
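The "how long to recoup" question is usually answered with a simple payback period. The sketch below shows the arithmetic only; the function name and both dollar figures are hypothetical placeholders, not numbers from McKinsey or from any real deployment.

```python
import math
from typing import Optional


def payback_period_months(upfront_cost: float, monthly_net_benefit: float) -> Optional[int]:
    """Return the number of whole months needed to recoup an upfront investment.

    Returns None when the monthly net benefit is zero or negative,
    because the investment is then never recouped.
    """
    if monthly_net_benefit <= 0:
        return None
    return math.ceil(upfront_cost / monthly_net_benefit)


# Hypothetical inputs, for illustration only.
UPFRONT_COST = 600_000         # hardware, software, staff and skills
MONTHLY_NET_BENEFIT = 25_000   # new revenue plus savings, minus running costs

months = payback_period_months(UPFRONT_COST, MONTHLY_NET_BENEFIT)
print(f"Payback period: {months} months")
```

A simple payback figure like this ignores the time value of money, so it works best as a first-pass number for the budget discussion rather than as a full financial model.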

Value, as McKinsey states, will come in the form of faster times to market from Big Data analysis that give the enterprise a competitive edge, or superior ways to evaluate and respond to consumer buying patterns that enable the company to capture more revenue. In some cases, Big Data analysis (think healthcare) can provide insights that allow organizations to revamp operations for less waste, thereby reducing costs. These savings or earnings projections are usually penciled out by line-of-business managers at the budget table.

That leaves ROI from the data center, which can be a challenge for CIOs. Remember: high performance computing for Big Data is not virtualized, so it is not likely to contribute return on investment in saved data center floor space or energy savings.

Instead, the CIO should look at energy consumption and data center efficiencies from the standpoint of server utilization. In a traditional transaction processing environment, server utilization generally hovers around 40-60 percent for Intel servers running virtual applications and operating systems, and around 80 percent for a mainframe. The wasted utilization occurs because transaction servers must often wait for transaction requests to come in. In contrast, servers that function in HPC clusters for Big Data processing contain different processing nodes, with each node processing a single thread of the data, and with no interruptions or wait times. These nodes operate in parallel. It is only at the end of the processing that all Big Data threads are brought back together for a composite Big Data analysis. Because of this parallel processing, Big Data HPC servers usually run at 90-95 percent utilization, with almost no idle time.
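One way to turn these utilization figures into a budget argument is to express them as the cost of an hour of actual work. The sketch below uses the utilization ranges quoted above and assumes, purely for illustration, that every server costs the same to run per hour; the hourly cost and the function name are hypothetical.

```python
def cost_per_useful_hour(hourly_cost: float, utilization: float) -> float:
    """Cost of one hour of real work on a server that is busy `utilization` of the time."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be greater than 0 and at most 1")
    return hourly_cost / utilization


# Hypothetical: one unit of cost per server-hour, identical across platforms.
HOURLY_COST = 1.00

# Utilization figures taken from the ranges cited in the article.
scenarios = {
    "virtualized Intel (low end)": 0.40,
    "virtualized Intel (high end)": 0.60,
    "mainframe": 0.80,
    "HPC cluster (low end)": 0.90,
    "HPC cluster (high end)": 0.95,
}

costs = {name: cost_per_useful_hour(HOURLY_COST, u) for name, u in scenarios.items()}

for name, cost in costs.items():
    print(f"{name}: {cost:.2f} cost units per useful hour")
```

Real platforms differ in hourly cost, so the CIO would substitute actual figures for each one. The structure of the argument stays the same: the higher the utilization, the less of each dollar pays for idle time.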

Efficient resource utilization is the area of Big Data ROI that the CIO needs to bring to the budget discussion. If this resource utilization argument is combined with the time to market, operational efficiencies and revenue optimization arguments from the end business, the CFO will feel a lot more comfortable about the investment.


Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President o...

