For many years most organizations have maintained central repositories of information such as customer details, product lists, and sales records. The benefits of holding this core information in a central location are obvious, and its value to every part of the organization it supports, from top management to staff on the ground, is incalculable.
More recently, several organizations and companies have begun investigating the next stage in the evolution of these central reference systems. This article seeks to outline the reasons for moving to this more sophisticated system, and the things to consider before your organization makes the migration.
The users of these existing central reference systems generally fall into three distinct groups:
- The Consumers: End users or systems that use the data found in these systems to support their own work
- The Contributors: Users or systems that can update data in the central system as part of their normal activity
- The Data Managers: Users whose role it is to maintain the integrity and oversight of data and the metadata used by the system
It is the combined harmonious working of these three groups that keeps the system current and relevant.
However, it is easy for these systems to become disconnected from each other and from the other master data. They can fall out of date when changes made in other systems are not replicated to all systems, or when that replication is delayed. This kind of disconnect can undermine the usefulness of these systems and, in the worst case, lead an end user to make a decision based on incorrect data.
One solution to this is to build a single master reference system to which all the master systems within the company publish their data and from which they can collect the master data that they need to use.
Let's use a generic corporation as an example to illustrate the benefits of this approach. This company—call it Company X—already has several identified and acknowledged master data systems for different types of data:
- ProdMan: A Product Management tool that stores information on what the company sells and is the key system for information such as product name, branding, product information documentation, and the like
- FormMan: A Formulation Management tool that stores information on how to make what the company sells and is the key system for information such as the ingredients and recipe which make a Company X product, including any safety documentation for the individual components or final products that may be required for correct and safe use and storage
- GraphMan: A Graphics and Branding Repository that stores graphic files for all Company X's brands, products, corporate logos, and the like, as well as detailed information on when and where they can be used; these images have been created by in-house and third-party designers
- HRMan: A Personnel Management tool listing all employees, contact details, etc.
- LabelMan: A Product Label Management tool which manages the creation of labels
- PackMan: A Packaging Management tool that manages the packaging used on Company X's products
As you can see, each of these systems is the master for its particular specialty of data. On the surface it would appear as though Company X had a good grasp of its data and its management.
However, we cannot be sure that every recipe stored in FormMan is used in a product that Company X currently sells and that is managed by ProdMan. Nor can we easily tell whether we have a label for each product that ProdMan says we are selling. How would we add a new product? At present this would be done multiple times, through different methods and screens in each system.
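The gaps described above can be expressed as simple set comparisons. The sketch below assumes, purely for illustration, that each system can export its records together with the product ID each record belongs to. The function name `find_gaps` and the sample IDs are hypothetical and do not come from any Company X system.

```python
def find_gaps(prodman_products, formman_recipes, labelman_labels):
    """Compare product IDs exported by three systems (hypothetical exports).

    formman_recipes and labelman_labels map a record ID to the product ID
    that the record claims to belong to.
    """
    sold = set(prodman_products)
    recipe_products = set(formman_recipes.values())
    label_products = set(labelman_labels.values())
    return {
        # Recipes whose product is not managed by ProdMan.
        "orphan_recipes": sorted(
            rid for rid, pid in formman_recipes.items() if pid not in sold
        ),
        # Products ProdMan says we sell but that have no label.
        "products_without_label": sorted(sold - label_products),
        # Products ProdMan says we sell but that have no recipe.
        "products_without_recipe": sorted(sold - recipe_products),
    }


gaps = find_gaps(
    prodman_products=["P1", "P2", "P3"],
    formman_recipes={"R1": "P1", "R2": "P9"},
    labelman_labels={"L1": "P1", "L2": "P2"},
)
```

A check like this only works once the systems agree on how a product is identified, which is exactly what separately managed systems tend to lack.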
As each system is separately managed and maintained, we could have differences in data due to keying errors; differences in usage of upper and lower case, spacing, etc.; or linguistic differences such as those between American and British English.
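As a rough illustration of how such differences might be neutralized before comparing records, the sketch below builds a comparison key that ignores case, spacing, and a few known spelling variants. The spelling map is a deliberately tiny, hypothetical example, and genuine keying errors (typos) would need fuzzy matching, which is not shown here.

```python
import unicodedata

# Hypothetical, deliberately tiny British-to-American spelling map.
SPELLING_VARIANTS = {"colour": "color", "flavour": "flavor", "grey": "gray"}


def normalize_name(name):
    """Return a comparison key that ignores case, spacing and known variants."""
    # NFKC folds compatibility characters (e.g. full-width letters) first.
    text = unicodedata.normalize("NFKC", name).casefold()
    words = text.split()  # split() also collapses runs of whitespace
    return " ".join(SPELLING_VARIANTS.get(word, word) for word in words)


def same_product_name(a, b):
    """True when two names differ only in case, spacing or known spellings."""
    return normalize_name(a) == normalize_name(b)
```

With this key, `same_product_name("Grey  Colour Wash", "gray color wash")` treats the two spellings as the same product name, while the stored values themselves are left unchanged.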
We could do a reconciliation exercise across these applications, but that would take a long time. In most cases, we have an acknowledged master system so we could correct the other systems where appropriate, but for some data elements, we would not have this, and we would have to find other ways of working out which was the correct value.
Like any company of its size, Company X uses various electronic means such as EAI, ETL, and EDI to move data around internally. At present, the interfaces for each of these systems would need to be built and maintained individually at significant cost to Company X in terms of money and resources. Also, any change in any of the systems could have a major impact on the interfaces.
A ray of sunlight
Looking at this picture, Company X would like a solution to ensure consistent and correct data throughout all spheres. Obviously, there is no one single system which Company X can purchase which will hold and manage all of its data requirements. The effort to create one, in terms of the development time and the retraining of users, would be significant.
What Company X could do is to create a central repository for its master data, which would load each system's own master data into itself at regular intervals. This is done without removing any of the current systems within Company X's inventory.
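A minimal sketch of such a load, assuming each system can export its records keyed by a shared product ID (an assumption; the article does not say how records are matched at this stage). The function `load_into_central` and the sample field names are hypothetical.

```python
def load_into_central(central, system_name, records):
    """Merge one system's export into the central store, keyed by product ID.

    Each system's fields are kept under the system's own name, so no source
    overwrites another and the origin of every value stays visible.
    """
    for product_id, fields in records.items():
        central.setdefault(product_id, {})[system_name] = dict(fields)
    return central


# One scheduled run: every system is loaded in turn, none is replaced.
central = {}
load_into_central(central, "ProdMan", {"P1": {"name": "Sparkle Wash"}})
load_into_central(central, "LabelMan", {"P1": {"label_file": "p1_front.pdf"}})
load_into_central(central, "FormMan", {"P2": {"recipe": "R7"}})
```

Keeping each source under its own key is one possible design choice: it makes the composite view easy to build later without deciding up front which system wins when two of them disagree.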
This then provides end users within Company X with a single system which can show them the complete set of information about a product. This single presentation layer would be similar to the concept of the corporate-dashboard type data displays which are currently quite popular.
The next step is to allow them to manage the data using this system, and have their changes pushed back automatically to the relevant systems. In this case, of course, only certain people would have the right to update data.
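One way to picture this step is a small router that checks the user's rights and then queues the change for whichever system masters the field. Everything in this sketch is assumed for illustration: the field-to-system mapping, the `edit:<system>` role names, and the in-memory outboxes standing in for real interfaces.

```python
# Hypothetical mapping of each field to the system that masters it.
FIELD_OWNER = {
    "product_name": "ProdMan",
    "recipe": "FormMan",
    "label_file": "LabelMan",
}


def route_update(user_roles, field, value, outboxes):
    """Queue a change for the owning system if the user may edit that field."""
    owner = FIELD_OWNER[field]
    if f"edit:{owner}" not in user_roles:
        raise PermissionError(f"user may not edit fields mastered by {owner}")
    # The outbox stands in for whatever interface pushes data to the system.
    outboxes.setdefault(owner, []).append((field, value))
    return owner


outboxes = {}
route_update({"edit:ProdMan"}, "product_name", "Sparkle Wash Plus", outboxes)
```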
Company X now has a single system which provides a composite view of the information about all of its products and allows changes to be made in a uniform way to all of its underlying systems.
Additionally, each system now talks to any other via the central system; for example, LabelMan pushes an update to the labels used for Product X into the central system, from where ProdMan collects them and updates its own information concerning Product X. On the technical side, this now means that if any of the underlying systems change, then only their link to the central system needs to be checked. This is a much simpler and significantly more manageable situation for Company X.
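The hub-and-spoke arrangement can be sketched in a few lines. The class below is an in-memory stand-in, not a real integration platform, and its names (`CentralHub`, `subscribe`, `publish`) are hypothetical. It also notifies subscribers immediately, whereas the text describes ProdMan collecting the update; a polling variant would read from `store` instead.

```python
from collections import defaultdict


class CentralHub:
    """Minimal in-memory stand-in for the central system (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks
        self.store = {}  # (topic, product_id) -> latest payload

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, product_id, payload):
        # Keep the latest value centrally, then notify interested systems.
        self.store[(topic, product_id)] = payload
        for callback in self._subscribers[topic]:
            callback(product_id, payload)


hub = CentralHub()
prodman_view = {}  # what ProdMan knows about labels

# ProdMan registers interest in label changes; LabelMan publishes one.
hub.subscribe("label", lambda pid, data: prodman_view.update({pid: data}))
hub.publish("label", "Product X", {"label_version": 2})
```

The saving is easiest to see by counting links. If every pair of Company X's six systems needed its own interface, that would be 6 × 5 / 2 = 15 interfaces to maintain; through the central system it is six, one per system.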
This communication between the central system and the underlying systems could be done either by using custom code interfaces in any language you like or via a technology platform such as EAI or ETL, depending on your application landscape.
This is a significant step forward from where Company X was before.
A nice puppet, but no soul
At this point, the central system is acting only as a point of reference for all of the Master Data within Company X. The system itself does not create any of the data; it simply provides a composite overview of data stored within the underlying systems and a method by which this data can be updated.
However, to add a new product, Company X still has to manually add it to each of the systems before it appears in the central system, and each system has its own unique user interface and naming conventions.
If Company X decided to upgrade its central system to allow it to create new products, skeleton records for these products would then be passed down to be fleshed out by the appropriate users of the underlying systems before being brought back to the central system in complete form. To facilitate this and to ensure that everyone knows which record is which, Company X introduces a new naming convention for its products: an ID generated by the central system, which becomes the point of origin for this data.
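A small sketch of this idea follows, under stated assumptions: the ID format (`CX-` plus a six-digit counter), the class name `ProductRegistry`, and the list of systems that must fill in a skeleton are all invented for illustration.

```python
import itertools


class ProductRegistry:
    """Hypothetical central-system component that issues product IDs."""

    # Systems assumed to contribute a section to every new product.
    REQUIRED_SECTIONS = ("ProdMan", "FormMan", "LabelMan", "PackMan")

    def __init__(self, prefix="CX"):
        self._prefix = prefix
        self._counter = itertools.count(1)
        self.products = {}

    def create_skeleton(self, working_title):
        """Create an empty product record and return its new central ID."""
        product_id = f"{self._prefix}-{next(self._counter):06d}"
        self.products[product_id] = {
            "working_title": working_title,
            "sections": {name: None for name in self.REQUIRED_SECTIONS},
        }
        return product_id

    def fill_section(self, product_id, system_name, data):
        """Record the data an underlying system has supplied for a product."""
        sections = self.products[product_id]["sections"]
        if system_name not in sections:
            raise KeyError(f"unknown system: {system_name}")
        sections[system_name] = data

    def is_complete(self, product_id):
        sections = self.products[product_id]["sections"]
        return all(value is not None for value in sections.values())


registry = ProductRegistry()
new_id = registry.create_skeleton("Sparkle Wash Plus")
registry.fill_section(new_id, "FormMan", {"recipe": "R7"})
```

Because the ID is issued in one place, every underlying system can refer to the same product without agreeing on a name first.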
New for old
As part of Company X's ongoing IT program, upgrades are made to all of its systems. Now that the central system is in place, each system no longer needs to store its own set of master data as metadata for itself, but can load it live from the central system. Additionally, some of the functionality in these systems may be migrated from the underlying systems to the central system. In some cases, it may even be possible to remove an underlying application entirely after a period of time.
With the key management of Company X's products now being done via the central system, it is possible that the key functions previously performed by Company X's ProdMan are no longer done in that system, possibly to the extent that ProdMan becomes redundant for some, if not all, of its functionality.
How does Company X benefit?
Company X IT benefits significantly from the speed at which it can implement and integrate new systems. Each new tool only needs to concern itself with its own functionality and specific data, loading the rest from the central system.
The Company X business benefits because it has current and accurate data on all of its products available to aid in decision making and planning. Company X can also make slices of this available to its customers and suppliers either at a cost or as a value-added service.
Updates to information such as a customer address or product name can be made once in the central system rather than once in each underlying application, a practice that carries the risk of inconsistent entries.
What are the potential headaches?
Until an underlying system is in a position to take the master data it uses as metadata from the central system, there are likely to be issues with the structure and management of the data in each of the underlying systems. For example, if ProdMan only allows a product to be classified against a certain hierarchy, any changes to that hierarchy must be made in the central system and ProdMan before any data containing the new hierarchy can be passed to ProdMan.
As we are trying to keep all the systems in sync with each other, this means that no product using the new hierarchy can be passed to any underlying system until they are all ready. This is highly likely to lead to delays in changes being implemented and will probably require a significant mindset change on the part of Company X's IT and business users.
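The rule that nothing is released until every system is ready can be written as a simple gate. This sketch assumes, hypothetically, that hierarchy changes are tracked as integer version numbers and that each underlying system reports the latest version it supports.

```python
def can_release(product, system_hierarchy_versions):
    """Return (ok, lagging).

    ok is True only if every underlying system supports the hierarchy
    version the product is classified against; lagging lists the systems
    that are still behind.
    """
    needed = product["hierarchy_version"]
    lagging = sorted(
        name
        for name, version in system_hierarchy_versions.items()
        if version < needed
    )
    return (not lagging, lagging)


ok, lagging = can_release(
    {"id": "P1", "hierarchy_version": 2},
    {"ProdMan": 2, "LabelMan": 1, "PackMan": 2},
)
```

Reporting which systems are lagging, rather than only a yes or no, gives the project team something concrete to chase when a change is held up.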
As each system is linked to the central system, a complete analysis will need to be done to identify the data within the system, which fields it will export to and which fields it will need to import from the central system, and the impact of these changes. Once the link is made, fields inherited from the central system should no longer be editable within that system. Users may also need to be retrained on how the system works.
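The read-only rule for inherited fields might look like the following inside an underlying system. This is a sketch only; the class `LinkedRecord` and its field names are hypothetical, and a real application would enforce the rule in its user interface and database as well.

```python
class LinkedRecord:
    """A record in an underlying system after it is linked to the central system."""

    def __init__(self, local_fields, inherited_fields):
        self._local = dict(local_fields)
        self._inherited = dict(inherited_fields)  # owned by the central system

    def get(self, field):
        if field in self._inherited:
            return self._inherited[field]
        return self._local[field]

    def set(self, field, value):
        # Inherited fields can only change through the central system.
        if field in self._inherited:
            raise PermissionError(f"'{field}' is managed in the central system")
        self._local[field] = value

    def refresh(self, inherited_fields):
        """Apply a new snapshot of the fields owned by the central system."""
        self._inherited = dict(inherited_fields)


record = LinkedRecord(
    local_fields={"label_size": "A6"},
    inherited_fields={"product_name": "Sparkle Wash"},
)
record.set("label_size", "A5")  # local field: still editable
```

An attempt such as `record.set("product_name", "Other")` raises `PermissionError`, which is the behavior users would need to be prepared for when they are retrained.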
Under the hood
What are the components of this central system? So far we have seen a composite display of a variety of data elements in a common user interface to be viewed and managed by the employees of Company X. We have also seen a series of interfaces, by which data can be pushed to or pulled from the central system, ensuring that both it and its underlying systems are constantly up to date. But, what is beneath this layer?
The central system has two main components:
1. The core
This is the heart of the central system, comprised of:
- The central system's main database, storing all of the information it has received from the underlying systems as well as its own data and fields populated by a mixture of the two.
- A single user interface—probably Web based—by which users can view and manage the data within.
- An interface manager, which oversees the interactions between the underlying systems and the central system.
2. The virtual
This comprises systems and repositories which, while not considered part of the Core, are part of the overall concept. The user interface for these would reflect that of the Core. These could include Change Control workflows or a repository of images used in Company X's products.
These two elements together form the concept of what I refer to as a centralized Master Data Management System (MDMS) for the company (Figure A).
How to manage the monster
Given the sheer scope and size of this kind of central system, it will be almost impossible for one person to manage the project. However, a closer look at the components involved allows us to identify three main spheres of work and responsibility with this type of system:
- Technical: Responsible for development, infrastructure, networks, etc.
- Data Manager: Responsible for managing the Master Data, e.g., Corporate Data Architect
- Process Manager: Responsible for the process control side, including links to and from other systems
In conjunction with these three, I suggest that you have a senior manager or board member to provide oversight and assist in the planning and management of the entire project. In some cases, you may also be able to allocate discrete sections of the work to other project managers and developers under the supervision of this core group.
The path is long and stony
This concept of a single tangible and virtual central system is wonderful in theory, but can be complicated when it comes to execution. The sheer number of systems that need to be integrated into it, the number of business processes that need to be adapted to it, and the mindset changes required for both IT and the business are significant hurdles to the project. As such the path is often very long.
One Project Manager I know describes the MDMS as a Pandora's Box because it seems like a good idea at the time, but the moment you open it up, it never ends, and you can't get it all back in the box. Many people in both business and IT may initially only see the benefits of the end result of an MDMS, but do not see the pain involved in getting to this situation.
That is why planning is so important. An implementation like this takes a long time to complete, so it is important to plan carefully and to keep the business community involved in the project by providing some quick wins along the way.