Harvard study lays out best data strategies for multi-cloud environments

Harvard Business Review released an in-depth study on what it takes to have a top-of-the-line strategy for big data.

Data retention and management are becoming key drivers of success and growth for companies across the world. The ability to collect information about your business, optimize it and then put it to good use has become integral to the rise and fall of organizations.

Harvard Business Review released a study explaining the difficulties associated with strategizing how to use your data in multiple environments and ways companies can get past them.

"Data, the lifeblood of all business, is pouring in from sensors, networks, applications, and an ever-expanding horde of connected devices. But do enterprises even have access to all their data or are they blind to the data that matters most?" said Arun Murthy, Cloudera's chief product officer. 

Murthy explained the study, named Critical Success Factors to Achieve a Better Enterprise Data Strategy in Multicloud Environment, in detail during a speech at Cloudera's Analyst Conference in New York City on Tuesday.

"This report reveals specific obstacles modern enterprises must overcome to realize the true potential of their data and validates the need for a new approach to enterprise data strategy," he added.

The study found that while most company executives understand the value of data and the need for better strategies around it, most were still moving slowly to change things. 

Almost 70% of organizations recognize that a comprehensive data strategy is required for meeting business objectives, yet just 35% believe their companies are capable of handling it with their current analytics and data management systems.

For the most part, companies were focusing their initial data efforts on business intelligence and data warehousing, while fewer than half planned to incorporate artificial intelligence and machine learning.

The Harvard study also notes that more sophisticated data strategies were quickly becoming a requirement thanks to new regulations like the GDPR and others coming down the pipeline. Organizations now have no choice but to think about how their data is organized and secured, where it is kept and who has access to it.

This process in itself was helping a lot of companies realize just how much data they have and what it could potentially be used for.

"Organizations have traditionally had to secure their data and control access to it. Now, they need to orchestrate governance and security across different infrastructure and storage systems," the Harvard study said. 

"It is a question not only of adhering to the growing range of regulatory requirements but also establishing a broad governance policy that encompasses the ways that people, processes, and different technologies work with data in a compliant, auditable, and secure way."

Almost 80% of respondents said they were required to secure data within a regulatory framework and more than 50% told Harvard they expected to face new data privacy regulations in the near future. The GDPR was mentioned most often by respondents, 61% of whom are directly affected by it. 

Despite the need for change, half of the companies surveyed said they did not feel their current cloud service providers were able to meet their need for access to open-source software. 

The study said these platforms had to be more open in order to protect systems from becoming siloed and shut off from the larger data pool.

Multi-cloud environments were increasingly becoming the only option for companies looking to do a variety of things with their data. But this could be difficult for companies like PricewaterhouseCoopers (PwC), which has offices across the world that are subject to different data laws.

Mike Flynn, a partner at PwC who spoke at the Cloudera conference, said managing data resources was a challenge because the company wanted to centralize its data while giving its firms across the world some territorial sovereignty "in terms of where their data is located, the way that applications can access the data, and the way that it's encrypted."

Flynn added that using multi-cloud systems allowed PwC "to federate and deploy our cloud wherever we need it."

"One tool can push containers of applications into a cloud environment, even if we can't actually see any of the data, or what they're actually doing in their environments," he said.

Other companies had similar struggles, especially those with huge data sets collected in a number of different areas. 

Ranjith Raghunath, senior director of platforms, engineering, innovation and support at pharmaceutical giant GlaxoSmithKline, said new, varied data management strategies were key for research and development projects.

"Our thought was, how do you bring this all together, log the metadata and make it really useful so that you can give a core capability to the scientists that, when they're asked a descriptive question, can open up a tool and answer it. That was our call to action to set up a data platform," Raghunath said.

"We've been able to break silos. It takes 6-12 years to develop a drug. It costs $2.6 to $2.8 billion to create a drug. Our call to action is how do we use data technology and technology to move that process forward and Cloudera is at the center of that," Raghunath said.

Cloudera recently created a platform that makes it a bit easier for companies to undertake these kinds of projects and allows organizations to collect and store data in a number of cloud environments.

Nearly half of all companies that spoke to Harvard said they keep less than 25% of their data in the public cloud. Almost 50% said they did this because of issues with legacy software compatibility as well as security and cost constraints.

Companies were essentially scared off by the price and security worries that come with cloud-based data management systems, but the study said moving to the cloud was the best way to create synchronized data platforms.

Flynn from PwC echoed those comments, describing his company's new approach to data management and how it has affected every aspect of how its business runs.

"We want all of our workloads to be cloud native or cloud ready. We want all of our capabilities to be ephemeral, meaning we can spin them up and spin them down when they're needed or not needed," he said.

"That ability is what's going to drive the cost savings of being able to operate at scale globally, but to be able to do it cheaper than if I had all these different data centers spread around the world. I have to be able to turn off storage and programs when we aren't using them."

In the survey, 80% of companies admitted that they were avoiding using cloud systems because they feared being stuck with one cloud vendor, a problem commonly known as "vendor lock-in."

Two-thirds of respondents said they needed better data management policies in order to meet their company's goals, yet just 30% said their analytics and data management capabilities were up to par.

Gil Genio, chief technology and information officer at Filipino company Globe Telecom, touted Cloudera for helping his company move closer to the kind of data management systems it was looking for.

"Now, we are able to manage very large data sets. This foundation has helped us with our wireless service and improved fraud detection. We used to have a bad habit of sticking to 'black box' applications. When you try to scale, it becomes very difficult," he said onstage.

"We hope the promise of the Cloudera Data Platform is to make everything seamless. There has been a lot of heavy lifting in the company over the past few years, but we're excited about the promise of CDP moving forward to simplify all of that so we can turn our attention away from trying to figure out what the platforms are doing to further serving even more use cases in the future."

More than half of the executives in the survey said their company had multiple copies of data and nearly impenetrable silos. More than 40% said a lack of interoperability was holding them back from fully investing in cloud-based data management architectures.
