Technological innovations in communications and data networks have improved the
proficiency, efficiency, and bottom line of many enterprises. However, the massive
volume of data flowing across these networks tends, to varying degrees, to degrade
the reliability and quality of the information that can be gleaned from it. Without
good quality data, decision makers cannot be expected to make the best decisions.
Enterprises that can ensure the information used to make management decisions is
derived from quality data will have a definitive competitive advantage in the
marketplace.

The goal of Enterprise Knowledge Management: The Data Quality Approach, by David
Loshin (Morgan Kaufmann), is to demonstrate that data quality is not an esoteric
notion but something that can be quantified, measured, and improved. This
introductory chapter outlines and defines data quality and its relationship to
Knowledge Management and ROI.


Enterprise Knowledge Management: The Data Quality Approach

By David Loshin
ISBN: 0124558402
Publisher: Morgan Kaufmann
1st edition (January 22, 2001)
Pages: 493


 

Interview

In the following interview, TechRepublic asked David Loshin of Knowledge
Integrity, Inc., for some additional information concerning the relationship
between data quality and Knowledge Management, especially with regard to the IT
professional.

[TechRepublic] Through the course of written history, there have been major events
marked by reliance on what you classify as poor data quality. In other words, this
is not a new problem. Contrary to what might be considered common sense, the advent
of the information age and digital communications has really made poor data quality
more prevalent, not less, hasn’t it? Do you believe this is an inevitable result of
the speed at which data moves in modern business, or do you believe that, by
implementing the concepts expressed in your book, poor data quality can be
conquered through better planning and practices?

[Loshin] You are correct: even though poor data quality has plagued people for as
long as they have used data, today’s increased reliance on computers and
communication, along with the speed at which information is created, exchanged,
and consumed, has not only made poor data quality problems more prevalent, it has
also magnified their impacts and ramifications. The problem is not insurmountable,
however; I am confident that by applying the Knowledge Integrity approach to
understanding the root causes of flawed data, eliminating the problems at their
sources, and formally managing the business rules used to distinguish between
“good” and “bad” data, one can define key data quality indicators that can be used
to measure, and consequently control, the level of information quality performance.
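
As a minimal sketch of that idea (the rule names, fields, and sample records below
are illustrative assumptions, not taken from the book), a key data quality
indicator can be expressed as the percentage of records that conform to a named
business rule:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A business rule expressed as a named predicate over a record:
# records that satisfy the predicate count as "good" for that rule.
@dataclass
class QualityRule:
    name: str
    check: Callable[[Dict[str, str]], bool]

# Illustrative rules for a hypothetical customer record (assumed fields).
RULES: List[QualityRule] = [
    QualityRule("name_is_present", lambda r: bool(r.get("name", "").strip())),
    QualityRule("state_is_valid", lambda r: r.get("state", "") in {"NY", "NJ", "CT"}),
]

def indicator(records: List[Dict[str, str]], rule: QualityRule) -> float:
    """Key data quality indicator: percentage of records conforming to the rule."""
    if not records:
        return 100.0
    conforming = sum(1 for r in records if rule.check(r))
    return 100.0 * conforming / len(records)

if __name__ == "__main__":
    sample = [{"name": "Ada", "state": "NY"}, {"name": "", "state": "ZZ"}]
    for rule in RULES:
        print(f"{rule.name}: {indicator(sample, rule):.1f}% conformant")
```

Tracking such percentages over time is what turns the distinction between “good”
and “bad” data from a judgment call into something that can be measured and
controlled.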

[TechRepublic] In Chapter One of your book, you take some time to give readers
everyday examples of poor data quality and the problems, both large and small,
that it can cause.
Those examples really bring home the idea that poor data quality is not some
theoretical, purely academic concept. Do you think enterprises have really
accepted that poor data quality is a widespread problem? How do you overcome
the inertia of corporate culture when it comes to making the changes necessary
to establish enterprise knowledge management systems?

[Loshin] I have seen some important changes in management attitudes towards data
quality over the past ten years. During the mid-1990s, it would not be unusual for
a C-level manager to pay lip service to data quality as a key enterprise
requirement, but the levels of investment in information quality improvement were
limited in both scope and vision, typically depending on a commercial
off-the-shelf (COTS) data quality package managed by a single Information
Technology (IT) staff member.

More recently, I am seeing an increasing number of medium- and large-scale
companies dedicating resources to understanding the cost impacts associated with
poor data quality, and some actually following through by building enterprise
information quality centers of excellence organized along business lines rather
than within IT.

There are definitely change management issues involved in evolving this enterprise
awareness of the value of improved information quality, and our company has often
been engaged to apply the methods described in the book to expose and quantify the
costs incurred due to data that does not meet expectations. A large part of the
effort involves identifying how poor data actually impacts the business, ranging
from simple, hard costs (e.g., scrap and rework) to what we refer to as
“second-order impacts,” such as flawed analyses and poor business decisions that
seriously affect competitiveness. The key factor is being able to demonstrate how
the investment in improved information quality will directly improve the bottom
line. I would be surprised if any CEO would admit that demonstrating a high return
on investment isn’t part of their corporate culture!

[TechRepublic]
Enterprise applications often serve as the interface for decision makers
accessing organizational data. Do you think the makers of most large enterprise
application systems have taken the time necessary to ensure that good quality
data is collected by their software? Where have they failed?

[Loshin] People have a natural tendency to believe in the soundness of the
processes in which their products are being used, so it would not surprise me if
many application systems take the quality of the information used within them for
granted. On the other hand, the individuals using these application systems must
be able to get their jobs done, even when the application does not appropriately
support all potential operational use cases. The upshot is that clever people will
always find a way to get their data into the system, and this is often how invalid
or inappropriate data is introduced into a system.

For example, a recent customer interaction system we
assessed had an inordinate number of occurrences of the letter “X” in
a telephone number field. It turns out that the application required a value
for the telephone field and would not proceed to the next screen unless there was
some value provided. In the cases where the data entry person did not have a
phone number, some default value was provided in order to convince the
application to proceed.
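
As a small illustration of how a profiling check can surface this kind of flaw
(the field handling and placeholder patterns below are assumptions, not details of
the assessment described above), a pass over the telephone column can separate
valid numbers from placeholder defaults:

```python
import re
from collections import Counter
from typing import Iterable

# Placeholder patterns often typed in just to satisfy a mandatory phone field
# (assumed examples; real profiling would derive these from frequency analysis).
PLACEHOLDER = re.compile(r"^(X+|0+|9+|N/?A|NONE|UNKNOWN)$", re.IGNORECASE)
VALID_PHONE = re.compile(r"^\d{10}$")  # simple 10-digit check, for illustration only

def profile_phone_field(values: Iterable[str]) -> Counter:
    """Tally whether each phone value is valid, a placeholder, or otherwise invalid."""
    tally = Counter()
    for raw in values:
        v = re.sub(r"[\s\-().]", "", raw or "")  # strip common formatting characters
        if PLACEHOLDER.match(v):
            tally["placeholder"] += 1
        elif VALID_PHONE.match(v):
            tally["valid"] += 1
        else:
            tally["invalid"] += 1
    return tally

if __name__ == "__main__":
    print(profile_phone_field(["212-555-1234", "XXXXXXXXXX", "000-000-0000"]))
    # Counter({'placeholder': 2, 'valid': 1})
```

Reporting the placeholder count alongside the valid count makes the root cause, a
mandatory field with no legitimate value to enter, visible rather than buried in
the data.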

The disconnect between the application and the data does not
exist in all cases, though. Especially with some products that are specifically
used to move or share data, vendors are providing hooks to introduce data
validation capabilities into workflows. And with the advent of data cleansing
and auditing via Web services, we’ll see a growing trend towards catching data
flaws early in the workflow process.
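
One hedged sketch of what such a hook might look like (the step and validator
names are hypothetical, not any particular vendor's API): a data-movement step
that accepts pluggable validators and quarantines non-conforming records before
they are loaded.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]
Validator = Callable[[Record], bool]  # returns True if the record is acceptable

def load_step(records: Iterable[Record],
              validators: List[Validator]) -> Tuple[List[Record], List[Record]]:
    """Split incoming records into (loadable, quarantined) before they are loaded."""
    loadable, quarantined = [], []
    for rec in records:
        (loadable if all(v(rec) for v in validators) else quarantined).append(rec)
    return loadable, quarantined

# Example hooks: require a customer ID and reject phone values containing the
# "X" placeholder described above (both checks are illustrative assumptions).
hooks: List[Validator] = [
    lambda r: bool(r.get("customer_id", "").strip()),
    lambda r: "X" not in r.get("phone", "").upper(),
]

good, bad = load_step(
    [{"customer_id": "C1", "phone": "2125551234"},
     {"customer_id": "C2", "phone": "XXXXXXXXXX"}],
    hooks,
)
print(len(good), "loadable;", len(bad), "quarantined")  # 1 loadable; 1 quarantined
```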

[TechRepublic]
The need to collect data, especially quality data, seems to be merely a matter of
common sense, yet so many enterprises fail to set up adequate systems for
accomplishing the task. Is this failure simply a reflection of the usual economic
rules, in which well-managed companies do well and poorly managed companies fail?
Or do you foresee a day when good data collection techniques are so widely adopted
that most business failures are caused by bad decisions based on quality data,
rather than good decisions based on poor data? How do we reach that point?

[Loshin] There are a number of economic factors that affect the management of
information quality. The first is that many organizations have institutionalized
the expectation of poor quality data into apparently “normal” business operations.
For example, reacting to acute data quality problems is incorporated into so many
job descriptions that we no longer recognize the scrap and rework as a symptom of
a latent problem, and so treating the symptom is confused with solving the problem.

A second economic issue is that low-quality data is very often introduced at the
point of entry, and those tasked with data entry are typically not fully
integrated as stakeholders in the success of the organization (e.g., part-time
hourly workers); their incentives are based on volume, not on quality.

A third problem is that many organizations reward their staff members on a yearly
basis, based on their successes in short-term tactical projects, while improving
data quality requires a long-term, strategic approach; this mismatch erodes the
value of a personal commitment to information quality improvement.

I do have hope that the approaches companies like ours have taken in helping our
clients value information as an organizational asset will encourage a closer
inspection of the short- and long-term return on investment in data quality
improvement. Those companies that recognize that the data used to run the business
can also be used to improve the business will see that measurably high-quality
information provides a significant competitive advantage, and the methods I
describe in my book will help them realize this advantage.

