
Gain a measurable competitive advantage by ensuring enterprise data quality


Technological innovations in communications and data networks have improved the proficiency, efficiency, and bottom line of many enterprises. However, the massive amount of data flowing across these networks tends, to varying degrees, to degrade the reliability and quality of the information that can be gleaned from it. Without good quality data, decision makers cannot be expected to make the best decisions. Those enterprises that can ensure the information used to make management decisions is derived from quality data will have a definitive competitive advantage in the marketplace.

The goal of Enterprise Knowledge Management: The Data Quality Approach, by David Loshin (Morgan Kaufmann), is to demonstrate that data quality is not an esoteric notion but something that can be quantified, measured, and improved. This introductory chapter outlines and defines data quality and its relationship to Knowledge Management and ROI.


Enterprise Knowledge Management:
The Data Quality Approach

By David Loshin
ISBN: 0124558402
Publisher: Morgan Kaufmann
1st edition (January 22, 2001)
Pages: 493



In the following interview, TechRepublic asked David Loshin of Knowledge Integrity, Inc., for some additional information concerning the relationship between data quality and Knowledge Management, especially with regard to the IT professional.

[TechRepublic] Through the course of written history, there have been major events marked by reliance on what you classify as poor data quality. In other words, this is not a new problem. Contrary to what might be considered common sense, the advent of the information age and digital communications has really made poor data quality more prevalent, not less, hasn't it? Do you believe this is an inevitable result of the speed at which data moves in modern business, or, implementing the concepts expressed in your book, do you believe poor data quality can be conquered by better planning and practices?

[Loshin] You are correct. Even though poor data quality has plagued people for as long as they have used data, today's increased reliance on computers and communication, along with the speed at which information is created, exchanged, and consumed, has not just made poor data quality problems more prevalent; it has also magnified their impacts and ramifications. However, the problem is not an insurmountable one. I am confident that by applying the Knowledge Integrity approach to understanding the root causes of flawed data, eliminating the problems at their sources, and formally managing the business rules that are used to distinguish between "good" and "bad" data, one may define key data quality indicators that can be used to measure, and consequently control, the level of information quality performance.
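As an editorial illustration (not taken from the book or the interview), the idea of managing business rules and deriving data quality indicators from them might be sketched as follows. The rule names, the record fields, and the `conformance_rates` helper are all hypothetical; the sketch simply treats each rule as a predicate and reports the share of records that satisfy it.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], bool]

# Hypothetical business rules: each one separates "good" from "bad" records.
RULES: Dict[str, Rule] = {
    "customer_id_present": lambda r: bool(r.get("customer_id", "").strip()),
    "state_is_two_letters": lambda r: (
        len(r.get("state", "")) == 2 and r.get("state", "").isalpha()
    ),
}


def conformance_rates(records: List[Record], rules: Dict[str, Rule]) -> Dict[str, float]:
    """Return, for each rule, the fraction of records that satisfy it."""
    if not records:
        return {name: 0.0 for name in rules}
    total = len(records)
    return {
        name: sum(1 for record in records if rule(record)) / total
        for name, rule in rules.items()
    }


if __name__ == "__main__":
    sample = [
        {"customer_id": "17", "state": "MD"},
        {"customer_id": "", "state": "Maryland"},
    ]
    for name, rate in conformance_rates(sample, RULES).items():
        print(f"{name}: {rate:.0%}")
```

Tracked over time, per-rule rates like these are one possible form of the measurable indicators described above; which rules matter is a business decision, not something the code can supply.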

[TechRepublic] In Chapter One of your book you take some time to give readers everyday examples of poor data quality and the problems, both large and small, that it can cause. Those examples really bring home the idea that poor data quality is not some theoretical, purely academic concept. Do you think enterprises have really accepted that poor data quality is a widespread problem? How do you overcome the inertia of corporate culture when it comes to making the changes necessary to establish enterprise knowledge management systems?

[Loshin] I have seen some important changes in management attitudes towards data quality over the past ten years. During the mid-1990s, it would not be unusual for a C-level manager to pay lip service to data quality as a key enterprise requirement, but the levels of investment in information quality improvement were limited in both scope and vision, typically depending on the use of a commercial off-the-shelf (COTS) data quality software package managed by a single Information Technology (IT) staff member.

More recently, I am seeing an increasing number of medium- and large-scale companies dedicating resources to understanding the cost impacts associated with poor data quality, and some actually following through with building enterprise information quality centers of excellence organized along business lines instead of IT.

There are definitely change management issues involved in evolving this enterprise awareness of the value of improved information quality, and our company has often been engaged to apply the methods described in the book to expose and quantify the costs incurred due to data that does not meet expectations. A large part of the effort involves identifying how poor data actually impacts the business, from simple, hard costs (e.g., scrap and rework) to what we refer to as "second-order impacts," such as flawed analyses and poor business decisions that seriously affect competitiveness. The key factor is being able to demonstrate how the investment in improved information quality will directly improve the bottom line. I would be surprised if any CEO would admit that demonstrating a high return on investment isn't part of their corporate culture!

[TechRepublic] Enterprise applications often serve as the interface for decision makers accessing organizational data. Do you think the makers of most large enterprise application systems have taken the time necessary to ensure that good quality data is collected by their software? Where have they failed?

[Loshin] People have a natural tendency to believe in the soundness of the processes in which their products are being used, and so it would not surprise me if many application systems take the quality of the information used within them for granted. On the other hand, the individuals using these application systems must be able to get their jobs done, even when the application does not appropriately support all potential operational use cases. The upshot is that clever people will always find a way to get their data into the system, and this is often how invalid or inappropriate data is introduced.

For example, a recent customer interaction system we assessed had an inordinate number of occurrences of the letter "X" in a telephone number field. It turns out that the application required a value for the telephone field and would not proceed to the next screen unless there was some value provided. In the cases where the data entry person did not have a phone number, some default value was provided in order to convince the application to proceed.

The disconnect between the application and the data does not exist in all cases, though. Especially with some products that are specifically used to move or share data, vendors are providing hooks to introduce data validation capabilities into workflows. And with the advent of data cleansing and auditing via Web services, we'll see a growing trend towards catching data flaws early in the workflow process.
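As an editorial illustration of the telephone-field example above, here is a minimal sketch of a validation check that flags placeholder entries and lets an operator record "unknown" explicitly instead of typing filler. The function names and the placeholder heuristics are assumptions made for this sketch, not a description of the system Loshin assessed or of any vendor's validation hook.

```python
import re
from typing import Optional


def looks_like_placeholder(phone: str) -> bool:
    """Heuristic: no digits at all, or one character repeated throughout."""
    chars = re.sub(r"[^0-9A-Za-z]", "", phone)  # drop spaces and punctuation
    if not any(c.isdigit() for c in chars):
        return True  # e.g. "X", "XXX-XXX-XXXX", or an empty string
    return len(set(chars)) == 1  # e.g. "000-000-0000"


def classify_phone_entry(phone: Optional[str]) -> str:
    """Classify an entry; None means the operator explicitly has no number."""
    if phone is None:
        return "unknown"
    if looks_like_placeholder(phone):
        return "suspect"
    return "ok"


if __name__ == "__main__":
    for entry in ["301-555-0142", "XXX-XXX-XXXX", None]:
        print(repr(entry), "->", classify_phone_entry(entry))
```

The design point is the explicit "unknown" path: if the application accepts a missing number as a legitimate answer, the operator has no reason to invent a value just to reach the next screen.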

[TechRepublic] The need to collect data, especially quality data, seems to be merely a matter of common sense, yet so many enterprises fail to set up adequate systems for accomplishing the task. Is this failure a reflection of the typical economic rules: well-managed companies do well, poorly managed companies fail? Or do you foresee a day when the adoption of good data collection techniques will be widespread, with the result being that most business failures are caused by bad decisions based on quality data, rather than good decisions based on poor data? How do we reach that point?

[Loshin] There are a number of economic factors that affect the management of information quality. The first is that many organizations have institutionalized the expectation of poor quality data into apparently "normal" business operations. For example, reacting to acute data quality problems is incorporated into so many job descriptions that we no longer recognize the scrap and rework as being a symptom of a latent problem, and therefore treating the symptom is confused with solving the problem.

A second economic issue is that low-quality data is very often introduced at the point of entry, and those tasked with data entry are typically not fully integrated as stakeholders in the success of the organization (e.g., part-time hourly workers); their incentives are based on volume, not on quality.

A third problem is that many organizations reward their staff members on a yearly basis, based on their successes in short-term tactical projects, while improving data quality requires a long-term, strategic approach. This mismatch erodes the value of a personal commitment to information quality improvement.

I do have hope that the approaches that companies like ours have taken in helping our clients value information as an organizational asset will encourage a closer inspection of the short- and long-term return on investment in data quality improvement. Those companies that see that the data used to run the business can also be used to improve the business will see that measurably high-quality information provides significant competitive advantage, and the methods I describe in my book will help these companies realize this advantage.


About Mark Kaelin

Mark W. Kaelin has been writing and editing stories about the IT industry, gadgets, finance, accounting, and tech-life for more than 25 years. Most recently, he has been a regular contributor to TechRepublic, among other publications.
