Big data is the backbone of modern business, but before it can be used, it has to be properly managed. Here's an overview of the ins and outs of data management.
Every business in the world has to contend with data. From a single-person LLC to multinational enterprises, data is everywhere, and it needs to be properly managed to be an effective business tool.
Data isn't just customer records and other externally sourced information, though--employee records, network maps, payroll data, and other forms of external and internal information all count as data that has to be managed.
It takes a lot of work to turn data into something usable. Without proper management, you can end up with duplicate records, incorrect information, wasted time and storage space, and a host of other problems that come with poor organization. Digital data is a lot more complicated than paper, so organizing it requires specialized skills.
Enter the world of data management. Here are the essentials about data management, including models, software, implementation, and more. This article is also available as a download, Cheat sheet: Data management (free PDF).
What is data management?
There are as many ways to define data management as there are websites that focus on it. DAMA International, a consortium of data management professionals, defines data management as "the development and execution of architectures, policies, practices, and procedures in order to manage the information lifecycle needs of an enterprise in an effective manner."
In other words, data management is multidisciplinary and keeps data organized in a practical, usable manner. At its most fundamental level, data management works to ensure that an organization's entire body of data is accurate and consistent, readily accessible, and properly secured.
Along with being a way to eliminate duplicates and standardize formats, data management also lays the groundwork for data analytics. Without good data management, analysis is practically impossible at worst and unreliable at best.
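To make the idea of eliminating duplicates and standardizing formats more concrete, here is a minimal Python sketch. The record fields (name, email, phone) and the helper functions are hypothetical, chosen only for illustration; a real organization would more likely rely on a dedicated data management platform than on a hand-written script.

```python
import re


def standardize(record):
    """Return a copy of the record with consistent formatting."""
    return {
        # Collapse repeated spaces and use consistent capitalization.
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        # Keep digits only so "555-0100" and "(555) 0100" compare equal.
        "phone": re.sub(r"\D", "", record["phone"]),
    }


def deduplicate(records):
    """Keep the first record seen for each standardized email address."""
    seen = set()
    unique = []
    for record in map(standardize, records):
        if record["email"] not in seen:
            seen.add(record["email"])
            unique.append(record)
    return unique


# Hypothetical sample data: the first two rows describe the same person.
raw_records = [
    {"name": "ada  lovelace", "email": "Ada@Example.com ", "phone": "555-0100"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": "(555) 0100"},
    {"name": "grace hopper", "email": "grace@example.com", "phone": "555 0199"},
]

clean_records = deduplicate(raw_records)
print(len(clean_records))  # 2
```

Treating the email address as the matching key is an assumption made for this sketch; choosing which fields identify a duplicate is itself a data management decision.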
What is involved in a complete data management model?
If the definitions and descriptions of data management make your head spin a bit, it's understandable--there is a lot that goes into the practice of data management.
DAMA International breaks data management down into 11 knowledge areas:
Data governance, which is the planning of all aspects of data management. This commonly includes ensuring availability, usability, consistency, integrity, and security of data managed by an organization.
Data architecture, or the overall structure of an organization's data and how it fits into a broader enterprise architecture.
Data modeling and design, which covers discovering and analyzing data requirements and representing them in data models that guide the design, building, testing, and maintenance of data systems.
Data storage and operations, which is concerned with the physical hardware used to store and manage data.
Data security, which encompasses all elements of protecting data and ensuring only authorized users have access.
Data integration and interoperability, which includes everything to do with the transformation of data into a structured form (i.e., in an organized database) and the work necessary to maintain it.
Documents and content, which includes all forms of unstructured data and the work necessary to make it accessible to, and integrated with, structured databases.
Reference and master data, or the process of managing data in such a way that redundancy and other mistakes are reduced by standardizing data values.
Data warehousing and business intelligence, which involves the management and application of data for analytics and business decision making.
Metadata, which involves all elements of creating, collecting, organizing, and managing metadata (data that describes other data, such as headers).
Data quality, which involves the practices of monitoring data and data sources to ensure quality information is being delivered, integrity is being maintained, and poor quality data is being filtered out.
All of these elements have to be included in a total data management model; if even one element is missing, some aspect of managing data becomes more difficult, if not impossible. For instance, if you get rid of metadata management, you lose the ability to easily categorize data. Without data quality being ensured, the entire data structure becomes suspect, and analytics become useless. Eliminating integration and interoperability would make it nearly impossible to combine disparate forms of data into a usable whole.
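As one hedged illustration of the data quality knowledge area, the sketch below checks incoming records against a few rules and filters out the ones that fail. The payroll fields and the rules themselves are hypothetical assumptions for this example, not part of DAMA's framework.

```python
from datetime import date


def quality_issues(record, today):
    """Return a list of problems found in one (hypothetical) payroll record."""
    issues = []
    if not record.get("employee_id"):
        issues.append("missing employee_id")
    salary = record.get("salary")
    if not isinstance(salary, (int, float)) or salary < 0:
        issues.append("invalid salary")
    hire_date = record.get("hire_date")
    if hire_date is None or hire_date > today:
        issues.append("invalid hire_date")
    return issues


def split_by_quality(records, today):
    """Separate records that pass every check from those that do not."""
    accepted, rejected = [], []
    for record in records:
        issues = quality_issues(record, today)
        if issues:
            # Keep the reasons so the source of bad data can be investigated.
            rejected.append({"record": record, "issues": issues})
        else:
            accepted.append(record)
    return accepted, rejected


# Hypothetical sample data and reference date.
payroll = [
    {"employee_id": "E001", "salary": 52000, "hire_date": date(2019, 6, 1)},
    {"employee_id": "", "salary": -5, "hire_date": date(2020, 2, 10)},
]
accepted, rejected = split_by_quality(payroll, today=date(2024, 1, 1))
print(len(accepted), len(rejected))  # 1 1
```

Recording why each record was rejected, rather than silently dropping it, keeps the monitoring side of data quality visible alongside the filtering side.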
How does data management fit into a larger big data model?
If an analytics model is the product made from a business's data, then data management is the factory, the materials, the supply chain--everything that goes into making the product.
You can't have a big data model without data management--trying to do so would be like saying your messy desk is perfectly organized chaos in which you can find anything; in time, you're bound to lose something important.
Data management is a total lifecycle system that follows data from the moment it's created until it ceases to be useful. Data management tracks the data from place to place, monitors the transition of data from one form to another, and ensures that nothing important is left out of a business analytics model.
In short, data management doesn't just fit into a big data model--it's the umbrella under which all big data falls.
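The lifecycle view described above, following data from creation, through moves and format changes, until it is no longer useful, can be sketched in a few lines of Python. The `DataAsset` class, its field names, and the age-based retention rule are all hypothetical assumptions for illustration; real platforms track far more detail.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DataAsset:
    """A hypothetical record of one dataset and what has happened to it."""
    name: str
    created: date
    location: str
    data_format: str
    history: list = field(default_factory=list)

    def move(self, new_location, when):
        # Track the data from place to place.
        self.history.append((when, "moved", self.location, new_location))
        self.location = new_location

    def convert(self, new_format, when):
        # Monitor the transition from one form to another.
        self.history.append((when, "converted", self.data_format, new_format))
        self.data_format = new_format

    def is_past_retention(self, today, retention_days):
        # Assumption: "no longer useful" is approximated by age alone.
        return today - self.created > timedelta(days=retention_days)


asset = DataAsset("customer_orders", date(2020, 1, 15), "on_prem_db", "csv")
asset.move("cloud_warehouse", date(2021, 3, 1))
asset.convert("parquet", date(2021, 3, 2))

print(asset.location, asset.data_format, len(asset.history))
print(asset.is_past_retention(date(2024, 1, 1), retention_days=3 * 365))
```

Keeping the history as an append-only list is a deliberate choice in this sketch: it preserves where the data has been even after its current location and format change.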
What skills do data management professionals need?
There's no mistaking the essential role that data plays in the modern business world. Big data professionals need to have particular sets of skills that make good data management possible.
A data management team needs several people who are adept at certain elements of the entire end-to-end management chain. The skills a data management professional should be trained in include:
General computer science: A qualified data management professional should be trained in the basics of computer science--they're going to be spending a lot of time using basic skills to organize data.
Database programming: Some of the most important languages and tools in the data management world include SQL, Python, R, and Perl, along with the Hadoop framework and the XML markup language. Be sure to learn at least one of these languages and get familiar with the database platforms it is commonly used with.
BI/BA: Business intelligence (BI) and business analytics (BA) are at the core of why companies collect and organize data. Data management pros should be able to understand the hows and whys of analytics.
Cloud computing: Data hosting can take up a lot of storage space, which is why many businesses turn to the cloud to host, manage, and analyze their data. Skilled data management professionals should be familiar with AWS, Microsoft Azure, Google Cloud, IBM Cloud, and other major platforms.
Machine learning: Data analytics, in particular its later stages like predictive analytics and prescriptive analytics, makes extensive use of machine learning technology to decrease the computing time needed to deliver results.
Data management certifications: Data management is a science in and of itself, and there are several certificates that data management professionals can pursue. DAMA International offers the Certified Data Management Professional (CDMP) certification. Oracle, IBM, and others also offer certifications.
Soft skills: Making use of data requires a lot of collaboration with non-IT departments to plan and execute big data strategies. Good writing, speaking, and innovative thinking are an essential set of skills for successful data management professionals.
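For a sense of what the database programming skill looks like in practice, here is a small sketch that uses SQL from Python through the standard library's `sqlite3` module. The `customers` table and its columns are hypothetical, and an in-memory SQLite database stands in for whatever platform an organization actually uses.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,  -- the database itself rejects duplicates
        name TEXT NOT NULL
    )
    """
)

# Hypothetical sample rows; the third repeats the first email address.
rows = [
    ("ada@example.com", "Ada Lovelace"),
    ("grace@example.com", "Grace Hopper"),
    ("ada@example.com", "A. Lovelace"),
]

inserted = 0
for email, name in rows:
    try:
        conn.execute(
            "INSERT INTO customers (email, name) VALUES (?, ?)", (email, name)
        )
        inserted += 1
    except sqlite3.IntegrityError:
        # Raised when the UNIQUE constraint on email is violated.
        pass

names = [row[0] for row in conn.execute("SELECT name FROM customers ORDER BY name")]
print(inserted, names)  # 2 ['Ada Lovelace', 'Grace Hopper']
conn.close()
```

Declaring the constraint in the schema, rather than checking for duplicates in application code, lets the database enforce consistency no matter which program writes to it.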
What data management software is available?
Data management can't be done in a haphazard way--organizations need to invest in a data management platform that can deliver all the results they need to be successful in managing and using data.
There are a number of data management platforms, each with its own unique features and industries in which it fits.
Some platforms, like Google Cloud's big data analytics software, aren't specifically built to do data management, but that doesn't mean they can't do it. In the case of Google Cloud, all the necessary software is present, but it needs to be configured to function as a data management platform.
As with any major software platform, choosing the right one from the outset can make a huge difference in an organization's success. Make sure that when deciding on a platform, your data management team has a good understanding of the kind of data you have, how you want to host it, and what your end goals for data management are. Armed with that information, a data management team can make the best choice possible for the needs of their organization.
How can organizations get started with data management?
There may seem to be a million and one pieces to planning a data management initiative, but don't get bogged down in the weeds: Planning to integrate data management into your organization is just like any other business transformation project.
First, make sure your data management initiative has a clear goal: To what end are you trying to organize your data? A business that wants to use data to make internal changes, for example, will have different data management needs than a company that wants to use its data to increase sales.
Once you have a stated goal, it's time to think about what will be needed to make it happen. If your data exists entirely as unstructured files and documents, you're going to have a different starting point than an organization with large Hadoop clusters filled with well-organized records.
Consider all the possible needs: Reassignment of employees, new hires, training, software platforms, budget, timeframe, the types of data already on hand, the kinds of data that are needed, and more. Having all these elements in mind will help you when you actually start planning in earnest.
Next, it's time to put your talent in place. Hire new employees, reassign those who are going to start working on your data management project, and get the team acquainted with your data management goals.
Once your data management team is in place, it's time to start the planning phase. Beyond deciding how the team is going to accomplish its goals, this is when a data management platform is chosen, training is undertaken, and the whole model starts to come together.
After this, your data management team should be well on their way to building, testing, and implementing a full data management model. When all of those prerequisites are in place and data management is an integrated part of your business, it's time to start thinking about what comes next: How all of that well-organized data can help transform your organization, internally and externally.
The entire process of building a data management system can take a long time, and even then, data management is just the groundwork for further use of big data.
Data management isn't an end in and of itself: It's the house in which an organization's data lives. It's up to that organization to make use of the house it built by putting that data to work.
- How the right uses of big data can help your business flourish (TechRepublic)
- Volume, velocity, and variety: Understanding the three V's of big data (ZDNet)
- How companies can use big data for social good (TechRepublic)
- Feature comparison: Data analytics software, and services (TechRepublic Premium)
- Best cloud services for small businesses (CNET)
- Big data: More must-read coverage (TechRepublic on Flipboard)