A data architect defines “how the data will be stored, consumed, integrated and managed by different data entities and IT systems, as well as any applications using or processing that data.”
Developing a corporate-wide data blueprint that catalogs where various troves of data are stored, how they interact with systems and applications, how often they are refreshed, what the governance requirements for them are, and more is no small task.
It is made all the more difficult by how companies have accumulated their data: by buying disparate systems from many different vendors, each with its own data stores, and by neglecting for many years the need to integrate these data silos across the enterprise.
Past practice has focused on immediate project goals and on which data must be integrated to support a new application. Few organizations gave much thought to making all of the enterprise’s data work together.
Now, however, with the need to leverage more information across departments, companies want to break down their data silos. Whether every company needs a data architect to do so remains an open question.
Data architect vs. CDO
The role of the chief data officer (CDO) is to ensure the enterprise is getting the most business value out of its data. CDOs focus on data governance and compliance and work to ensure everyone within the enterprise has access to the data they need. However, CDOs don’t do the actual technical work of architecting data integration.
To actually break down data silos and move data across all systems, while ensuring everyone gets the data they need and guaranteeing sound data storage and management practices, a data architect with advanced database and data handling skills is needed.
The organizations that hire data architects tend to be large enterprises that have reached a point where mission-critical systems must be brought together to leverage data, or where they can no longer tolerate the pain of systems and data that run independently of each other.
One case in point is a large West Coast utility that reached such a level of pain from not being able to integrate data and systems that it hired a data architect to sort through the confusion and to break down data silos. The data architect was dedicated to the job of orchestrating data flows, storage and governance/administration because there was no one else on staff in either the database or the application group who could take on such a prodigious job.
Addressing data architecture if you can’t afford a data architect
Many organizations experience the same pain that this large utility did, but they lack the resources to hire a full-time data architect. What can they do?
Inventory data and systems
A junior staff member can take on the job of inventorying all of the data across the company, recording which systems contain the data, where it is located, who uses it, and more. The information can be placed into an asset management system so the data can be tracked.
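As a rough illustration of what one inventory entry might capture, here is a minimal Python sketch. The field names, system names and sample values are all hypothetical, and a real asset management system would have its own schema; the point is only that each entry records the system, location and users of a data set.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataAsset:
    """One inventory entry (hypothetical fields, not a standard schema)."""
    name: str
    system: str                      # which system contains the data
    location: str                    # where the data is stored
    users: List[str] = field(default_factory=list)  # who uses the data
    refresh: str = "unknown"         # how often the data is refreshed


def assets_by_system(assets: List[DataAsset]) -> Dict[str, List[str]]:
    """Group asset names by the system that holds them."""
    grouped: Dict[str, List[str]] = {}
    for asset in assets:
        grouped.setdefault(asset.system, []).append(asset.name)
    return grouped


# Made-up sample entries for illustration only.
inventory = [
    DataAsset("customer_accounts", "billing", "on-prem SQL server",
              ["finance", "support"], "daily"),
    DataAsset("meter_readings", "metering", "vendor cloud",
              ["operations"], "hourly"),
    DataAsset("invoices", "billing", "on-prem SQL server",
              ["finance"], "daily"),
]

for system, names in assets_by_system(inventory).items():
    print(system, names)
```

Grouping by system gives a quick view of where data silos sit, which is a useful starting point for later integration decisions.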
Use zero-trust networks
A zero-trust network monitors users across the enterprise, admitting only users who are authorized to access certain systems or data. The network also notifies IT if any new device, system or data store is added to the network.
The zero-trust network is a good way to make sure you know about all of the data and systems that are in use in the company, so you can track them. It can be managed by your network group, which updates your asset management system as needed.
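One way to picture that update step is a simple comparison between what the network reports and what the inventory lists. The Python sketch below assumes both lists are available as plain names; the function, the asset names and the idea of an exported list are hypothetical and do not reflect any particular zero-trust product.

```python
from typing import Dict, Iterable, List


def reconcile(discovered: Iterable[str],
              inventoried: Iterable[str]) -> Dict[str, List[str]]:
    """Compare assets seen on the network with assets in the inventory."""
    discovered_set, inventoried_set = set(discovered), set(inventoried)
    return {
        # On the network but missing from the inventory: add and review.
        "untracked": sorted(discovered_set - inventoried_set),
        # In the inventory but not seen on the network: possibly retired.
        "not_seen": sorted(inventoried_set - discovered_set),
    }


# Made-up sample names for illustration only.
seen_on_network = ["billing-db", "metering-api", "marketing-share"]
in_inventory = ["billing-db", "metering-api", "legacy-archive"]

report = reconcile(seen_on_network, in_inventory)
print(report)
```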
Prioritize data architecture moves based on mission-critical and pain-point needs
If your resources are limited, you can still work toward an overall data architecture by taking smaller steps. You can take these steps when you install a new system, or by prioritizing data integrations and silo takedowns according to mission-critical and pain-point needs.
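A small team could make that prioritization explicit with something as simple as the Python sketch below. The 1-to-5 pain scale, the rule that mission-critical work always comes first, and the sample projects are assumptions made for illustration; each organization would choose its own criteria.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IntegrationCandidate:
    """A possible integration or silo takedown (hypothetical fields)."""
    name: str
    mission_critical: bool
    pain: int  # assumed scale: 1 (minor annoyance) to 5 (severe)


def prioritize(candidates: List[IntegrationCandidate]) -> List[IntegrationCandidate]:
    """Order mission-critical work first, then by pain level, highest first."""
    return sorted(candidates,
                  key=lambda c: (c.mission_critical, c.pain),
                  reverse=True)


# Made-up sample projects for illustration only.
backlog = [
    IntegrationCandidate("HR and payroll", False, 4),
    IntegrationCandidate("Billing and CRM", True, 3),
    IntegrationCandidate("Outage and metering", True, 5),
]

for candidate in prioritize(backlog):
    print(candidate.name)
```

Working down a list like this, one integration at a time, is a way to take the smaller steps described above without a dedicated architect.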
It might take longer to arrive at an end-to-end data architecture, but you’ve started the journey.