At its heart, data modeling is about understanding how data flows through a system. Just as a map can help us understand a city’s layout, data modeling can help us understand the complexities of a data system, its structures, formats and handling functions.
SEE: Job description: Big data modeler (TechRepublic Premium)
By diagramming the flow of data, we can identify bottlenecks and inefficiencies. We can also spot opportunities for improvement. Data modeling lives on after the database has been created and deployed, helping us to keep track of changes and adapt our systems accordingly. But in order to understand and leverage data models to their utmost advantage, it’s important to first understand the different types of data models and what they can do.
- Why use data models?
- The different types of data models
- Levels of data abstraction
- Considering different model types and strategies for your business
Why use data models?
From a business perspective, data modeling provides several important benefits. It helps to create an efficient and logical database by eliminating redundancy, which saves storage space for large organizations with extensive digital assets.
The data modeling process also gives every system access to a single source truth, ensuring accurate reporting on all levels, from analytics through visualization. Consequently, data modeling is a critical process in the development of any digital enterprise that wants to become more data-driven.
The different types of data models
There are several types of data models that businesses can use. The three most common types are relational, dimensional and entity-relationship (ER).
The most popular database model format is relational, which stores data in fixed-format records and organizes it into tables with rows and columns. The most basic data model has two components: measures and dimensions. Raw data can be a measure or a dimension.
- Measures: These numerical values are used in mathematical calculations, such as sum or average.
- Dimensions: Text or numerical values. They aren’t used in calculations and include locations or descriptions.
In relational database design, “relations,” “attributes,” “tuples” and “domains” are some of the most frequently used terms. Additional terms and structural criteria also define a relational database, but the significance of relationships within that structure is what matters. Key data elements (or keys) connect tables and data sets together. Explicit relationships such as parent-child or one-to-one/many connections can also be established.
A dimensional model is a type of data model that is less rigid and structured than other types of models. It is best for a contextual data structure that is more related to the business use or context. Dimensional models are optimized for online queries and data warehousing tools.
Crucial data points, such as transaction quantity, are called “facts.” Alongside these facts are reference pieces of information known as “dimensions,” which can include things like product ID, unit price and transaction price.
A fact table is a dimensional model’s primary table. Retrieval can be quick and effective because data for a specific activity is kept together. However, the absence of linkages can make analytical retrieval and data usage difficult.
Entity-relationship (ER) model
The entity-relationship model is a graphical representation of a business’s data structure. It contains boxes with various shapes and lines to represent activities, functions or “entities” and associations, dependencies or “relationships,” respectively.
The ER model provides a framework for understanding, analyzing and designing databases. This type of data model is used most often to design relational databases.
In an ER diagram, entities are represented by rectangles, and relationships are represented by diamonds. An entity is anything that can be identified as distinct from other things. A relationship is an association between two or more entities. Attributes are the properties or characteristics of an entity or a relationship.
ER diagrams can be categorized into three types: One-to-one, one-to-many, and many-to-many relationships.
- One-to-one relationship: An example of a one-to-one relationship would be a Social Security Number (SSN) and a person. Each SSN can only be assigned to one person and each person can only have one SSN.
- One-to-many relationship: An example of a one-to-many relationship would be a company and employees. A company can have many employees, but each employee typically only works for one company.
- Many-to-many relationship: An example of a many-to-many relationship would be students and classes. A student can take many classes and a class can have many students enrolled in it.
Levels of data abstraction
With all types of data models, there are also different layout permutation possibilities. These three kinds of modeling levels for data abstraction are the most common:
Conceptual data model
The conceptual data model is the highest level of abstraction, representing the overall structure and content of a database but lacking detail about the data. It contains a description of the data but not the actual data itself. This type of model aims to show how data flows within the organization, capture the business requirements and define what types of data are needed.
Logical data model
The logical data model contains more detail than the conceptual data model and includes all the entities, relationships, attributes and rules that apply to the data. This type of model is used to design the database.
Physical data model
The physical data model contains all the specifics about how the logical model will be implemented. This model format includes table names, column names, types, lengths, primary keys, foreign keys, indexes and relationships.
Considering different model types and strategies for your business
The previously mentioned data model types and formats are the most popular, but they aren’t the only ones that exist for business use. Some companies will opt for hierarchical, network, object-oriented and/or multi-value models, depending on their specific situation and business use cases.
Regardless of what data model(s) you choose to bring into your company’s data strategy, it’s important to have the right people and processes in place to make these models work. Hiring a big data modeler is a good first step to selecting and operating successful data models for your business.
READ NEXT: Top Data Modeling Tools (TechRepublic)