The "one truth" philosophy says that it is desirable
to have a single definition for a data element or business term, that
there should be a common, shared definition for your master reference
data and perhaps even your major business entities. The "One Truth
Above All Else" anti-pattern occurs when this philosophy is taken to the
extreme and you seek to get to the one truth about all data entities and
data elements within your environment. The challenge is that getting
to the "one truth" about something, when it is even possible, often
requires significant effort. When this effort goes beyond the
point of diminishing returns and provides negative value to your
organization you have a significant problem on your hands.
Seeking the one truth is often an integral part of a
master data management (MDM) strategy where you strive to use the most
correct and current data available to you. To achieve this you need to
manage both the data itself and information about the semantics, source, and
quality of the data. This information is often referred to as metadata.
In the past MDM was thought of as primarily an issue for data
warehouse (DW)/business intelligence (BI) efforts, but with the growing
importance of service-oriented architecture (SOA) it is clearly important for
mission-critical real-time systems as well.
The goal of MDM is to rationalize the data stored in disparate systems, a
difficult but worthy task. Business stakeholders can benefit from
consistent data. Mergers and acquisitions often fail when data cannot be
reconciled. Support calls concerning data quality problems are avoided
when the source problems are actually addressed.
"Analysis paralysis" occurs when the
team, or more often the data professionals on the team, discover that it's
incredibly difficult and time-consuming to get people to agree to the "one
truth" about their data. This seems to be particularly true of DW/BI
projects, perhaps because they touch so many data sources that they very clearly
see that there are many versions or interpretations of the same basic concepts.
Sadly, experience seems to show that organizations are investing
significant amounts of effort seeking the one truth yet earning little or
often negative return on investment (ROI) for that effort. This is because
the project teams get hung up trying to get the data perfect instead of focusing
on delivering the high-quality working software which provides value to the organization.
Taken in moderation, seeking the one truth can provide value, but
when you take it to the extreme several negative side effects occur:
- Your organization's competitive position can be eroded by enforcing a
consistent viewpoint. The fact is
that various portions of your organization have different ways of working,
different priorities, and different constraints. There may not be one single
shared truth, and even if there were, it's going to change over time anyway. A
great example of this is a financial institution's series of
advertisements which show two similar pictures with a different word below each picture
(Figure 1 is a picture that I took in a hallway in
London's Heathrow airport). Then the
ad shows the same two pictures
again, but with the words switched. The point is that people see the world
differently, and that the financial institution understands that and is
flexible enough to act accordingly.
- The development team abandons the effort. Modern development is
evolutionary, not serial, in nature. The rest of IT doesn't have the
time to wait for the data professionals to get to the "one truth" before
continuing with actual development. If getting to the one truth
impacts the project timeline then the development team will often choose to
continue on without the data professionals. Sometimes the data
modeling effort is abandoned, but more often it will continue in parallel
only to see its results ignored by the development team which can no longer
use the information provided.
- Consensus, not the actual truth, sets in. When people
cannot agree to the "one truth" they very often instead agree to disagree
and settle on a definition that really doesn't get the job done but at least
it doesn't offend anyone too much.
- The gap between the data group and the rest of IT widens yet again.
Extensive data modeling efforts such as this will often appear to be little more than
yet another political power grab by the data group. In combination
with the challenges resulting from the
cultural impedance mismatch, this merely serves to drive another
wedge between the data group and the people whom they're supposed to be
supporting.
Figure 1. Questioning the "One Truth" philosophy.
"One truth" can be a nice vision to work toward in theory, but in
practice you'll likely only be able to narrow it down to several reasonably
similar truths. It may be important to recognize that there are several
truths and to identify those truths, but trying to force a single consistent
truth on all parties is futile at best. Don't let it prevent your
team from delivering important business value in a timely manner.
My advice is directly related to this:
- Take a practical approach. Recognize that there is a diminishing rate of return when it comes to modeling,
and that you can quickly reach the inflection point where further investment in
data modeling reduces the overall value to your organization. Once again, the
failure rate of traditional data warehousing efforts speaks for itself.
- Recognize that the true goal is to deliver business value, not perfect data.
The "one truth above all else" anti-pattern often kicks in when people have
lost sight of the overall goal, which is to develop high-quality working
systems which meet the changing needs of their stakeholders. Data
modeling efforts often take on a life of their own, or perhaps it's really a
death march of their own, when the one truth becomes the primary goal.
- The "one truth" is a moving target, so embrace change.
Requirements change over time, sometimes because of changes to the
business and technical environment and often because your stakeholders
simply didn't understand what they wanted in the first place. The
implication is that at best the one truth is a destination which you are
always moving towards but one that you'll never actually reach, therefore
trying to get it perfectly right up front isn't realistic. Invest some
time doing initial requirements envisioning, but recognize that there are swiftly diminishing
returns from modeling.
- Never rest. Expect entropy of your "truthful data" because data errors will creep into your source data.
An effective database regression testing strategy
will of course greatly reduce if not remove this problem, but few
organizations seem to have such a strategy in place (or even
realize that they need one).
- Be flexible defining semantics. Like it or not, there will
be a wide range of definitions and uses for the data within your
organization and that is perfectly ok. Language is imprecise:
although you should strive to clarify the definition of
something as much as possible, you'll rarely be able to get a single, perfect answer. The implication is that you
should strive to identify the range of acceptable definitions, and hopefully
weed out some of the unacceptable ones. Your applications will need to focus
on handling the exceptions that are out of bounds when they occur.
- Adopt a federated view, not a unified one. Different groups
within your organization will have different definitions for data, different
ways to work with it, and different priorities. Instead of trying to
club them into submission by forcing a single truth upon them, try
to enable them by supporting different models for each different line
of business. I'm not saying that this is easy to do, but I am saying that
it's your only viable option in any reasonably complex domain.
- Look at the whole picture, not just data. As the
Agile Data method points out, data is only one of many important
aspects which you need to consider. Not only do you need to
rationalize your data, you also need to rationalize the
business logic.
- Pick your battles wisely. Focus first on the data where inconsistencies
have the greatest impact on your organization. In other words, do an
informal risk assessment on a regular basis and prioritize
the work just as you would
business requirements on a software development project. Only through
prioritization such as this do you have any sort of hope of
maximizing stakeholder ROI within your organization.
- Adopt an agile/lean approach to data governance. It is possible to
have an effective, streamlined approach to
data governance which enables development teams to work with and produce
high-quality data assets.
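To make the federated-view and pick-your-battles advice above a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the `Term` class, the `impact` score (1 = minor, 5 = severe, taken from an informal risk assessment), the `reconciliation_backlog` function, and the sample terms and definitions are all hypothetical and not part of any particular tool or of the Agile Data method itself. Each line of business keeps its own accepted definition, and only the terms whose definitions actually differ are ranked for attention.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    """A business term with one accepted definition per line of business."""
    name: str
    # Hypothetical score from an informal risk assessment (1 = minor, 5 = severe).
    impact: int
    # Maps a line of business to the definition that group actually uses.
    definitions: Dict[str, str] = field(default_factory=dict)

    def definition_for(self, line_of_business: str) -> str:
        """Return the definition used by one line of business."""
        if line_of_business not in self.definitions:
            # Out-of-bounds case: the application must handle it explicitly.
            raise LookupError(
                f"{self.name!r} has no accepted definition for {line_of_business!r}"
            )
        return self.definitions[line_of_business]

    def distinct_definitions(self) -> int:
        """Count how many different definitions are in use."""
        return len(set(self.definitions.values()))


def reconciliation_backlog(terms: List[Term]) -> List[Term]:
    """Keep only terms with differing definitions, highest impact first."""
    conflicting = [t for t in terms if t.distinct_definitions() > 1]
    return sorted(conflicting, key=lambda t: t.impact, reverse=True)


# Made-up sample data for illustration only.
customer = Term("customer", impact=5, definitions={
    "retail": "anyone who has made a purchase",
    "marketing": "anyone we have contacted",
})
region = Term("region", impact=2, definitions={
    "retail": "sales territory",
    "logistics": "shipping zone",
})
currency = Term("currency code", impact=4, definitions={
    "retail": "ISO 4217 code",
    "finance": "ISO 4217 code",
})

backlog = reconciliation_backlog([region, currency, customer])
print([t.name for t in backlog])  # prints ['customer', 'region']
```

Note that nothing here forces the definitions to converge: "currency code" drops out of the backlog because the groups already agree, while "customer" and "region" remain as several acknowledged truths, ordered by how much their inconsistency is judged to matter.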
I'd like to thank Curt Sampson and
Dawn Wolthuis for their
feedback regarding this article.