
A major pharmaceutical manufacturer used advanced analytics to significantly increase its vaccine production yield without incurring additional capital expenditures, according to a 2014 McKinsey & Company article. Using statistical analytics, the pharma project team targeted the nine most influential process parameters in manufacturing, made process changes, and increased vaccine yield by more than 50%, saving between $5 million and $10 million annually.
These are the kinds of big data and analytics results that the corporate C-suite expects to see now that the honeymoon days of big data experimentation are over and companies are deploying big data apps in production.
“As companies move their big data apps to production, they find that big data systems are subject to many of the same rules that established transactional data systems are,” said Ash Munshi, CEO of Pepperdata, a big data performance toolset provider. “Big data systems must be reliable and scalable to accommodate a great number of concurrent system users. These systems must also perform well. There can be a thin line between a system that is business critical and one that is business useless, and it is how the system performs that determines whether you cross that line.”
There are many ways to determine how well a system is performing, but three key areas where big data performance questions stand out are:
- how fast new applications are developed and deployed;
- how well the system is executing these new applications; and
- whether the system is performing as economically and efficiently as possible.
“To achieve these goals, every aspect of the application development chain must be looked at,” said Munshi. That ranges, he said, from using a cluster analyzer to assess how data is being captured and processed, to monitoring resource utilization so that processing occurs under optimal system conditions, to giving developers debugging tools for their code and identifying areas in the software development chain that need a formal policy, so that, for instance, a random low-priority job doesn’t get injected into a job stream and delay mission-critical processing.
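One way to put that last point into practice is to separate mission-critical and ad hoc work at the scheduler level. The sketch below is a minimal illustration, assuming a PySpark application running on YARN with “production” and “adhoc” capacity queues already configured; the queue names and the priority flag are assumptions for the example, not details from Pepperdata or the article.

```python
# Minimal sketch: route a Spark job to a YARN queue based on its priority,
# so a random low-priority job never lands in the mission-critical stream.
# Assumes "production" and "adhoc" capacity queues exist on the cluster;
# both queue names and the is_mission_critical flag are illustrative.
from pyspark.sql import SparkSession

def build_session(app_name: str, is_mission_critical: bool) -> SparkSession:
    # Decide the queue before submission; the scheduler then guarantees
    # capacity for the production queue regardless of ad hoc load.
    queue = "production" if is_mission_critical else "adhoc"
    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.yarn.queue", queue)
        .config("spark.dynamicAllocation.enabled", "true")  # let the cluster reclaim idle executors
        .getOrCreate()
    )

if __name__ == "__main__":
    spark = build_session("nightly-revenue-rollup", is_mission_critical=True)
    print("Submitted to queue:", spark.conf.get("spark.yarn.queue"))
    spark.stop()
```

Keeping the routing decision in one place, rather than in each developer’s submit script, is the kind of formal policy Munshi is describing.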
LinkedIn is one company Pepperdata has teamed with to assist big data developers who are creating mission-critical applications. The two companies have taken LinkedIn’s Dr. Elephant project and integrated it with the application profiler in Pepperdata’s analytics toolset.
“What Dr. Elephant does is provide performance recommendations that big data application developers can reference while they develop software,” said Munshi. “It tells them what is going on in a big data processing cluster they are working with, and then gives them a list of performance tuning recommendations.”
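The recommendations such a profiler surfaces are typically job-level settings, things like executor memory, cores, and shuffle parallelism. The snippet below is only an illustration of what acting on that kind of advice looks like in a PySpark job; the specific values are assumptions for a mid-sized workload, not output from Dr. Elephant.

```python
# Illustrative only: applying the kind of executor and shuffle tuning a
# profiler such as Dr. Elephant might recommend. The values are assumed
# for a mid-sized workload, not recommendations from the article.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuned-etl-job")
    .config("spark.executor.memory", "6g")           # right-size the executor heap
    .config("spark.executor.cores", "4")             # cores per executor
    .config("spark.sql.shuffle.partitions", "400")   # match shuffle parallelism to data volume
    .getOrCreate()
)

# Job logic would run here; the point is that the tuning lives alongside the
# code, where a developer can adjust it as the profiler's advice changes.
spark.stop()
```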
Developer self-help tools like these couldn’t come at a better time. It is difficult to develop an application, and to foresee every potential system bottleneck, when that application crosses many different servers, storage units, and networks, which is precisely the case with big data.
“A lot of this cross-checking can get skipped when developers are under tight deadlines,” said Munshi. “This is where automated tools can play a major role. They give big data application developers a better understanding of what is going on.”
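A concrete, low-effort example of that kind of automated cross-check is polling Spark’s monitoring REST API after a run and flagging executors that are close to their memory limit. The sketch below assumes the driver UI is reachable at localhost:4040 and uses an arbitrary 80% threshold; both are assumptions for illustration, not tooling described in the article.

```python
# Minimal sketch of an automated cross-check: read Spark's monitoring REST API
# and flag executors whose storage memory use is near the limit.
# The localhost:4040 endpoint and the 80% threshold are illustrative assumptions.
import requests

SPARK_UI = "http://localhost:4040/api/v1"
THRESHOLD = 0.80

apps = requests.get(f"{SPARK_UI}/applications", timeout=10).json()
for app in apps:
    executors = requests.get(
        f"{SPARK_UI}/applications/{app['id']}/executors", timeout=10
    ).json()
    for ex in executors:
        if ex.get("maxMemory"):
            usage = ex["memoryUsed"] / ex["maxMemory"]
            if usage > THRESHOLD:
                print(f"{app['name']}: executor {ex['id']} at {usage:.0%} of storage memory")
```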