To get actionable insights from your big data, you need data aggregations that are highly relevant to the particular algorithm that evaluates them. When the data and the algorithm line up this way, data breakthroughs occur.

Data silos, which exist in virtually every company, are a barrier to this kind of insight, and breaking them down is still a major work in progress for most organizations. Here’s how your enterprise can overcome four key challenges of breaking down data silos.

1. Hidden data troves

Almost every business department has data that lives in an off-the-shelf database or application that the department purchased and installed on its own. Often, this “off the grid” data is a workaround for the data and processing limitations of corporate systems that fail to give the department what it needs to optimize its productivity. Sometimes internal resistance to sharing data stalls progress, too.

This creates data silos, which make it difficult for companies to account for all of their information assets. Data siloing limits a company’s ability to use data from a variety of sources for analytics.

A good approach to breaking down data silos in this situation is to implement a corporate-wide IT asset management and tracking system. Because accounting for all data and IT assets means inventorying everything that every department has, the CEO and other senior managers should support corporate-wide IT asset management before the project starts. It’s the CIO’s job to secure this buy-in from the top.

2. Random vendor interaction

With the growth of shadow IT, vendor software and databases can come through virtually any departmental door. Systems from different vendors that departments independently buy don’t necessarily interact well with each other. When this occurs, systemic data silos can arise because of cross-system and data integration failures.

The best way to address this issue is to require interoperability and a full set of application programming interfaces (APIs) in the requests for proposal (RFP) that IT and individual business departments issue to vendors.

One way to ensure that system and data interoperability is a front-page requirement on RFPs is for IT to create a standard RFP that is required by purchasing or whichever department authorizes tech purchases. This standardized form can be used by IT and end-user departments.

3. The wrong set of integration tools

Most systems and databases sold by vendors have some type of API for data integration; however, totally seamless integration and the ability to easily aggregate data from disparate systems can never be assumed. IT and end users often need additional toolsets. One is robotic process automation (RPA), which is primarily an end-user software tool for automating business processes and simpler data transfers. Another is extract, transform, load (ETL), a more complex data sharing software used by IT that often requires coded logic to transform data into the forms in which it is needed.
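As a rough illustration of the coded transformation logic that an ETL step can involve, here is a minimal Python sketch that uses only the standard library. Everything specific in it is an assumption made for the example: the two departmental exports, their column names and delimiters, and the customer_summary table are hypothetical, and a production ETL tool would handle far more (scheduling, validation, error handling).

```python
import csv
import io
import sqlite3

# Hypothetical exports from two departmental systems with different schemas.
SALES_CSV = "cust_id,amount_usd\n101,250.00\n102,75.50\n"
SUPPORT_CSV = "CustomerID;TicketCount\n101;3\n103;1\n"


def extract(text, delimiter):
    """Extract: parse a delimited export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def transform(sales_rows, support_rows):
    """Transform: map both schemas onto one record per customer."""
    merged = {}
    for row in sales_rows:
        rec = merged.setdefault(int(row["cust_id"]), {"revenue": 0.0, "tickets": 0})
        rec["revenue"] += float(row["amount_usd"])
    for row in support_rows:
        rec = merged.setdefault(int(row["CustomerID"]), {"revenue": 0.0, "tickets": 0})
        rec["tickets"] += int(row["TicketCount"])
    # Sort by customer ID so the output order is predictable.
    return [(cid, rec["revenue"], rec["tickets"]) for cid, rec in sorted(merged.items())]


def load(records, conn):
    """Load: write the unified records into a shared repository table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_summary "
        "(customer_id INTEGER PRIMARY KEY, revenue REAL, tickets INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO customer_summary VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run the three stages in order and return the loaded records."""
    records = transform(extract(SALES_CSV, ","), extract(SUPPORT_CSV, ";"))
    load(records, conn)
    return records
```

Calling run_etl(sqlite3.connect(":memory:")) produces one row per customer, including customers that appear in only one of the two source systems. That gap-filling is the kind of aggregation siloed systems cannot do on their own.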

No one tool performs every integration task required for data aggregation, so it is up to IT to learn about these tools, test them, and assemble a toolbox of software that can address the data integration and aggregation problems that arise within the company.

4. Non-digital data

Most companies still have paper, film-based video, photos, and drawings that are stacked away on shelves, in closets, or in off-site storage. At any given time, it could become necessary to track down and aggregate this non-digital data, whether it is for the discovery process of a lawsuit or a review of paper-based medical records.

There is no instant solution. You have to identify the data and documents that are needed and then digitize them into a form that can easily be aggregated into the data repository you are building.

Overcoming these challenges will help you break through the silos and put your organization in a position to achieve good data analytics results.