Agile Data

The "One Truth Above All Else" Anti-Pattern

AgileData.org: Techniques for Disciplined Agile Database Development

Scott Ambler + Associates

The "one truth" philosophy says that it is desirable to have a single definition for a data element or business term, that there should be a common, shared definition for your master reference data and perhaps even your major business entities.  The "One Truth Above All Else" anti-pattern occurs when this philosophy is taken to the extreme and you seek to get to the one truth about all data entities and data elements within your environment.  The challenge is that to get to the "one truth" about something, when it is even possible, often requires significant effort.  When this effort goes beyond the point of diminishing returns and provides negative value to your organization you have a significant problem on your hands. 

 

 

Seeking the one truth is often an integral part of a master data management (MDM) strategy where you strive to use the most correct and current data available to you.  To achieve this you need to manage both the data itself and information about the semantics, source, and quality of the data.  This information is often referred to as metadata.  In the past MDM was thought of as primarily an issue for data warehouse (DW)/business intelligence (BI) efforts, but with the growing importance of service oriented architecture (SOA) it is clearly important for mission-critical real-time systems as well.
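
As a rough illustration of what such metadata might look like, the sketch below attaches semantics, source, and quality information to a single data element.  The class name ElementMetadata, its fields, the is_trustworthy rule with its thresholds, and the sample values are all hypothetical, chosen only for illustration; real MDM tooling models this far more richly.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ElementMetadata:
    """Hypothetical metadata record for one data element."""
    element: str          # name of the data element
    definition: str       # semantics: what the element means
    source_system: str    # source: where the value originates
    last_verified: date   # quality: when the data was last checked
    completeness: float   # quality: fraction of records populated (0.0 to 1.0)


def is_trustworthy(meta: ElementMetadata, today: date,
                   max_age_days: int = 90,
                   min_completeness: float = 0.95) -> bool:
    """Apply an assumed, illustrative quality rule to the metadata."""
    age_days = (today - meta.last_verified).days
    return age_days <= max_age_days and meta.completeness >= min_completeness


# Sample record; the system name "crm" and all values are made up.
customer_name = ElementMetadata(
    element="customer_name",
    definition="Legal name of the party holding an account",
    source_system="crm",
    last_verified=date(2012, 1, 15),
    completeness=0.98,
)
```

The point of keeping this information next to the data is that questions such as "can I rely on this value today?" become something a system can check rather than something people must remember.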

The goal of MDM is to rationalize the data stored in disparate systems, a difficult but worthy task.  Business stakeholders can benefit from consistent data.  Mergers and acquisitions often fail when data cannot be reconciled.  Support calls concerning data quality problems are avoided when the source problems are actually addressed.

 

The Problem

"Analysis paralysis" occurs when the team, or more often the data professionals on the team, discover that it's incredibly difficult and time consuming to get people to agree to the "one truth" about their data.  This seems to be particularly true of DW/BI projects, perhaps because they touch so many data sources that they very clearly see that there are many versions or interpretations of the same basic concepts.  Sadly, experience seems to show that organizations are spending significant amounts of effort seeking the one truth yet earning little or often negative return on investment (ROI) for that effort.  This is because the project teams get hung up trying to get the data perfect instead of focusing on delivering the high-quality working software which provides value to it stakeholders.

 

The Implications

In moderation, seeking the one truth can provide value, but when you take it to the extreme several negative side effects occur:

  1. Your organization's competitive position can be eroded by enforcing a consistent viewpoint. The fact is that various portions of your organization have different ways of working, different priorities, and different constraints. There may not be one single shared truth, and even if there were, it's going to change over time anyway. A great example of this is the series of HSBC advertisements which show two similar pictures and below each picture a different word (Figure 1 is a picture that I took in a hallway in London's Heathrow airport). Then the ad shows the same two pictures again, but with the words switched. The point is that people see the world differently, and that as a financial institution, HSBC understands that and is flexible enough to act accordingly.
  2. The development team abandons the effort.  Modern development is evolutionary, not serial, in nature.  The rest of IT doesn't have the time to wait for the data professionals to get to the "one truth" before continuing with actual development.  If getting to the one truth impacts the project timeline then the development team will often choose to continue on without the data professionals.  Sometimes the data modeling effort is abandoned, but more often it will continue in parallel only to see its results ignored by the development team which can no longer use the information provided.
  3. Consensus, not the actual truth, sets in.  When people cannot agree to the "one truth" they very often instead agree to disagree and settle on a definition that really doesn't get the job done but at least it doesn't offend anyone too much. 
  4. The gap between the data group and the rest of IT widens yet again.  Extensive data modeling efforts such as this will often appear to be little more than yet another political power grab by the data group.  In combination with the challenges resulting from the cultural impedance mismatch, this merely drives another wedge between the data group and the people whom they're supposed to be supporting.

Figure 1. Questioning the "One Truth" philosophy.

 

The Solution

"One truth" can be a nice vision to work toward in theory, but in practice you'll likely only be able to narrow it down to several reasonably similar truths.  It may be important to recognize that there are several truths and to identify those truths, but trying to force a single consistent truth on all parties is futile at best.  Don't let it prevent your team from delivering important business value in a timely manner.  My advice is directly related to this: Take a practical approach and recognize that there is a diminishing rate of return when it comes to modeling, and that you can quickly reach the inflection point where further investment in data modeling reduces the overall value to your organization. Once again, the failure rate of traditional data warehousing efforts speak for themselves. Agile Database Techniques
  1. Recognize that the true goal is to deliver business value, not perfect data. The "one truth above all else" anti-pattern often kicks in when people have lost true sight of the overall goal which is to develop high-quality working systems which meet the changing needs of their stakeholders.  Data modeling efforts often take on a life of their own, or perhaps it's really a death march of their own, when the one truth becomes the primary goal.
  2. The "one truth" is a moving target, so embrace change.  Requirements change over time, sometimes because of changes to the business and technical environment and often because your stakeholders simply didn't understand what they wanted in the first place.  The implication is that at best the one truth is a destination which you are always moving towards but one that you'll never actually reach, therefore trying to get it perfectly right up front isn't realistic.  Invest some time doing initial requirements envisioning, but recognize that there are swiftly diminishing returns from modeling.
  3. Never rest.  Expect entropy of your "truthful data" because data errors will creep into your source data.  An effective database regression testing strategy will of course greatly reduce if not remove this problem, but few organizations seem to have such a strategy in place (or even realize that they need to do so). 
  4. Be flexible defining semantics.  Like it or not, there will be a wide range of definitions and uses for the data within your organization and that is perfectly ok.  Language is imprecise: although you should strive to clarify the definition of something as much as possible, you'll rarely be able to get a single, perfect answer. The implication is that you should strive to identify the range of acceptable definitions, and hopefully weed out some of the unacceptable ones.  Your applications will need to focus on handling the exceptions that are out of bounds when they occur.
  5. Adopt a federated view, not a unified one.  Different groups within your organization will have different definitions for data, different ways to work with it, and different priorities.  Instead of trying to club them into submission by forcing a single truth upon them, try to enable them by supporting different models for each different line of business.  I'm not saying that this is easy to do, but I am saying that it's your only viable option in any reasonably complex domain.
  6. Look at the whole picture, not just data.  As the first philosophy of the Agile Data method points out, data is only one of many important aspects which you need to consider.  Not only do you need to rationalize your data, you also need to rationalize the business logic.
  7. Pick your battles wisely.  Focus first on the data where inconsistencies have the greatest impact on your organization.  In other words, do an informal risk assessment on a regular basis and prioritize the work just as you would prioritize business requirements on a software development project.  Only through prioritization such as this do you have any sort of hope of maximizing stakeholder ROI within your organization.
  8. Adopt an agile/lean approach to data governance.  It is possible to have an effective, streamlined approach to data governance which enables development teams to work with and produce high-quality data assets.
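
To make points 4 and 5 above a little more concrete, here is a minimal sketch of a federated view of one business term.  Each line of business keeps its own accepted definition, and a record that fits none of them is flagged as an out-of-bounds exception.  The term "active customer", the two lines of business, the field names, and the functions classify and handle are all assumptions made up for illustration, not something prescribed by the Agile Data method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, int]


@dataclass(frozen=True)
class Definition:
    """One line of business's accepted meaning for a shared term."""
    line_of_business: str
    description: str
    matches: Callable[[Record], bool]   # does a record fit this definition?


# Hypothetical range of acceptable definitions for "active customer".
ACTIVE_CUSTOMER: List[Definition] = [
    Definition("retail", "purchased within the last 12 months",
               lambda r: r.get("months_since_purchase", 999) <= 12),
    Definition("lending", "has at least one open loan",
               lambda r: r.get("open_loans", 0) > 0),
]


def classify(record: Record, definitions: List[Definition]) -> List[str]:
    """Return the lines of business whose definition the record satisfies."""
    return [d.line_of_business for d in definitions if d.matches(record)]


def handle(record: Record) -> str:
    """Accept any record within the acceptable range; flag the rest."""
    fits = classify(record, ACTIVE_CUSTOMER)
    if not fits:
        return "exception: fits no accepted definition"
    return "active for: " + ", ".join(fits)
```

For example, handle({"months_since_purchase": 3, "open_loans": 0}) reports the record as active for retail only, while a record with neither a recent purchase nor an open loan is routed to exception handling.  Neither group is forced to adopt the other's definition; the application only has to know the range of definitions that is acceptable.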

 

Acknowledgements

I'd like to thank Curt Sampson and Dawn Wolthuis for their feedback regarding this article.

 

Suggested Online Readings

Agile Database Techniques

This book describes the philosophies and skills required for developers and database administrators to work together effectively on project teams following evolutionary software processes such as Extreme Programming (XP), the Rational Unified Process (RUP), the Agile Unified Process (AUP), Feature Driven Development (FDD), Dynamic System Development Method (DSDM), or The Enterprise Unified Process (EUP).  In March 2004 it won a Jolt Productivity award.

Refactoring Databases

This book describes, in detail, how to refactor a database schema to improve its design. The first section of the book overviews the fundamentals of evolutionary database techniques in general and of database refactoring in detail.  More importantly it presents strategies for implementing and deploying database refactorings, in the context of both "simple" single application databases and "complex" multi-application databases.  The second section, the majority of the book, is a database refactoring reference catalog.  It describes over 60 database refactorings, presenting data models overviewing each refactoring and the code to implement it.

 

The Object Primer 3rd Edition: Agile Model Driven Development (AMDD) with UML 2

This book presents a full-lifecycle, agile model driven development (AMDD) approach to software development.  It is one of the few books which covers both object-oriented and data-oriented development in a comprehensive and coherent manner.  Techniques the book covers include Agile Modeling (AM), Full Lifecycle Object-Oriented Testing (FLOOT), over 30 modeling techniques, agile database techniques, refactoring, and test driven development (TDD).  If you want to gain the skills required to build mission-critical applications in an agile manner, this is the book for you.
 

 

 


 




Copyright © 2002-2012 Scott W. Ambler

This site owned by Ambysoft Inc.