Agile Master Data Management (MDM)
AgileData.org: Techniques for Disciplined Agile Database Development
The primary goals of Master Data Management (MDM) are to promote a shared foundation of common data definitions within your organization, to reduce data inconsistency within your organization, and to improve overall return on your IT investment. MDM, when it is done effectively, is an important supporting activity for service oriented architecture (SOA) at the enterprise level, for enterprise architecture in general, for data warehouse (DW)/business intelligence (BI) efforts, and for software development projects in general. Traditional approaches to data management (DM), particularly those based on extensive modeling and a serial approach to performing the work, have a poor track record in practice. MDM is likely to struggle if you do not move away from traditional DM strategies. In this article I show that agile software development strategies, which are based on evolutionary development, collaborative ways of working, and a focus on providing concrete value to the business, offer significant value for MDM efforts.
Agile software development (ASD) is an evolutionary approach which is collaborative and self-organizing in nature, producing high-quality systems that meet the changing needs of stakeholders in a cost-effective and timely manner. MDM and ASD are clearly different things, although they are compatible. An agile approach to MDM:
The main differences between "Agile MDM" and "traditional MDM" are centered on the approach to doing the work, not the fundamental work itself. In other words, when you do the work, how you do it, and who you do it with are the critical issues. An agile approach to MDM achieves the goals of MDM (promoting common data definitions, reducing data inconsistency, and improving IT ROI) by embedding MDM activities into the overall software process in a manner which reflects the environment of modern IT departments. The following basic MDM activities are still performed (if and when they make sense) with an agile approach, but as you'll see they're accomplished in a more effective and efficient manner:
Any agilist reading the above list is likely reeling from the potential for out-of-control bureaucracy surrounding MDM. Considering the past track record of most data management efforts (more on this in a bit), this is a significant concern. As you'll learn in the rest of this article, it is in fact possible to streamline MDM efforts so that the value is achieved without the pain of needless bureaucracy, although as you would imagine this will require significant culture shifts in some organizations.
The best way to deliver value is to work closely with development teams and their stakeholders to ensure that the MDM effort is focused on supporting the creation of business functionality that stakeholders actually need now, not at some undefined point in the future. Traditional, documentation-heavy, command-and-control approaches to MDM are often doomed to failure because the MDM program is too tedious for teams to follow. With a collaborative approach to MDM:
It is very easy to claim that you intend to take a collaborative approach to MDM, but a lot harder to actually do so. Traditional data management has a poor track record of working together closely and effectively with development teams, as you see in Figure 1. This chart summarizes the results of two questions asked in the Dr. Dobb's Journal (DDJ) 2006 State of Data Management Survey: the first question asked whether development teams find the need to go around their organization's data group (the majority did) and the second asked why they did so. Interestingly, 25% of the problem was around simple education issues with developers (they need to know who to work with and when to do so) and 75% of the problem rested on the shoulders of the data group (people found them too difficult to work with, too slow, or not offering sufficient value). The point is that if your development teams are currently frustrated with the level of service provided by your organization's data group then it will be that much more difficult for the data group to make inroads into the teams to support any sort of MDM effort.
Figure 1. Reasons why development teams go around data groups.
If the MDM activities, particularly the ones involving work to identify and capture metadata, are separate from day-to-day development activities then there is very little chance of your MDM program succeeding. The easiest way to embed MDM activities into your development process is to educate team members on the importance of MDM and to ensure that one or more people have the appropriate skills to collaborate with the enterprise administrator(s) and enterprise architect(s) responsible for MDM efforts. If your team has one or more agile DBAs then MDM activities should be part of their daily jobs, and ideally they will have tools which automate as much of this work as possible.
The challenge is that development teams in general, and in particular agile teams with their focus on high-value activities, will be reluctant to do this sort of data-oriented work if they perceive it as extraneous. Worse yet, few development methods explicitly include these sorts of activities, in part because the people behind the methods often lack experience in such activities but mostly because the data community struggles to make their techniques relevant to modern-day development.
MDM by definition must have an organization/enterprise-level view, and an agile approach to MDM is no exception. However, that doesn't mean that MDM has to be an onerous, command-and-control activity which does little more than justify the existence of your data management group for the year or two that they're able to milk MDM before it fails due to not producing measurable value. Instead, with a collaborative and lean approach your enterprise administrator(s) and enterprise architect(s) can achieve the stated goals of MDM in a sustainable way. Agile MDM is both a project-level and an enterprise-level activity, and the needs of these two levels will need to be balanced in a manner which reflects your unique situation.
The evidence that evolutionary (iterative and incremental) approaches to software development are superior to serial approaches has been mounting for years. This is true of data-oriented activities too, as this site clearly shows. Technically it is quite easy to take an evolutionary approach to IT activities, including data activities, but the true challenges often prove to be cultural.
Not only is it possible to analyze legacy data sources, to collect metadata, and then to support development teams in an evolutionary manner, you really have no choice in the matter. This is obvious for several reasons:
With an evolutionary approach to MDM you want to work in priority order. This order should be set by the business, not by the IT department. A common Agile strategy, exemplified in development methods such as Open Unified Process, Extreme Programming (XP), and Microsoft Solutions Framework (MSF) for Agile, is to have the stakeholders prioritize the work to be done, not the IT professionals. This strategy is depicted in Figure 2 and described in detail in Agile Requirements Change Management. This enables you to maximize return on investment (ROI) because you're always working on the most important functionality required by your stakeholders. Yes, your enterprise architecture and enterprise business modeling efforts will still guide your work, but this guidance will be reflected in the overall prioritization of the work.
Figure 2. Agile requirements change management process.
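The priority-driven strategy described above can be sketched as a small work item queue. This is only an illustration under stated assumptions: the class name `WorkItemStack`, the numeric priority scale (1 is most important), and the example work items are all hypothetical and do not come from any particular agile method or tool.

```python
import heapq


class WorkItemStack:
    """A stakeholder-prioritized stack of work items (hypothetical sketch).

    Priorities are assigned by stakeholders; 1 is the most important.
    Items with equal priority are taken in the order they were added.
    """

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker that preserves insertion order

    def add(self, name, priority):
        # New or changed requirements can be added at any time.
        heapq.heappush(self._heap, (priority, self._counter, name))
        self._counter += 1

    def next_iteration(self, capacity):
        """Remove and return up to `capacity` highest-priority items."""
        taken = []
        while self._heap and len(taken) < capacity:
            _, _, name = heapq.heappop(self._heap)
            taken.append(name)
        return taken


stack = WorkItemStack()
stack.add("Catalog every column of the legacy billing system", priority=9)
stack.add("Customer address change use case", priority=1)
stack.add("Monthly churn report", priority=2)
stack.add("Order status web service", priority=1)

# The team pulls only what it can deliver in one iteration.
print(stack.next_iteration(capacity=2))
```

Note that the technically motivated cataloging task stays at the bottom of the stack until stakeholders decide it matters, which is the point of letting the business set the order.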
This is probably the most radical advice which I present in this article: data is a secondary concern for MDM, not a primary one. An IBM study into CRM showed that the primary success factors for CRM were business-oriented and cultural in nature and not technical. Considering that MDM is arguably CRM applied to all major business concepts and not just customers, we should really take heed of these findings. In other words, you must focus on usage, not on data.
With a usage-driven approach your major requirements artifacts explain how people will work with, or interact with, the system. Examples of such artifacts include use cases, user stories, and usage scenarios which are primary artifacts of OpenUP, XP, and MSF for Agile respectively. Business process models could also arguably be used here, but none of the major agile development methodologies use them as a primary artifact although Agile Modeling includes them as potential models which you should apply where appropriate. When these artifacts are created rigorously they often refer to other types of requirements, such as business rules and report specifications. However, these sorts of details are often explored on a just-in-time (JIT) model storming basis during the project so many agile teams won’t invest in rigorously documenting them because the useful lifetime of such documentation is very short.
The value in usage models, in particular use cases and usage scenarios, is that they focus on the business objectives which end users are trying to accomplish by using your system(s). If your stakeholders are able to prioritize the various usages, then development teams are in a position to deliver something of concrete value, namely the implementation of those usages. If they implement them in priority order then they will also maximize stakeholders’ return on investment (ROI) in IT.
A common mistake which often leads to failure is to let technology decisions drive your prioritization strategies. For example, a favorite IT strategy is to work on one legacy system at a time, analyzing and then cataloging the metadata for the entire system. This sort of initial, detailed cataloging effort can take years to accomplish and will more than likely run out of steam long before any concrete results are produced. Another ill-fated strategy is to focus on specific data entities one at a time. Although this approach has more merit than the previous one, you may find that you need to do this for a large number of entities before you can start providing real business value from your efforts. The fundamental problem is that technical prioritization strategies do not reflect the priorities of the business which you are trying to support, putting any IT effort, including MDM efforts, at risk because your stakeholders aren’t receiving concrete value in a timely manner. When stakeholders don’t perceive the value that they’re getting for their IT investment they quickly start to rethink such investment.
Worse yet, some MDM efforts run aground on the “one truth” shoals – they strive to develop one definition for each data entity within an organization. In theory this is a laudable goal but in practice it’s virtually impossible because few organizations can actually come to an agreement on the definitions of major concepts. Furthermore, it’s often a competitive advantage for your organization to treat various concepts differently at times based on the given context. A wonderful example of this is HSBC’s series of billboard and airport advertisements around the world showing two different pictures with captions, then showing the same two pictures with the captions swapped. Figure 3 is a picture that I took in a hallway in London's Heathrow airport. In short, efforts to try to identify the “one truth” are likely misguided and unlikely to actually produce value. My advice is to worry less about gathering perfect metadata and instead focus on delivering valuable business functionality.
Figure 3. Questioning the "One Truth" philosophy.
Many traditional IT efforts find themselves in trouble when they take a document-based approach to reporting progress. For example, in earned value management (EVM) you claim progress against your plan when you achieve various milestones called out in those plans. On traditional software development projects these milestones are typically based on delivery of key documentation such as requirements specifications, design specifications, test plans, and eventually the working system. Traditional MDM efforts may choose to measure earned value in terms of the metadata collected, such as the number of entity types or entity attributes defined. The challenge with a document-based approach to measuring earned value is that there is a tenuous relationship between documentation and actual delivery of working functionality which provides real value to business stakeholders. When you think about it, you’re doing little more than justifying bureaucracy with document-based EVM.
Agile teams deliver “earned value” in the form of a working solution, which for a software development project is the delivery of working software and for a DW/BI project the delivery of analytic data and supporting reports. Therefore, with an agile approach to MDM your focus shouldn’t be on collecting metadata (although you will still do that) but instead should be on:
In other words, don’t do MDM for the sake of doing MDM; instead do it to streamline stakeholder-facing data-oriented activities. The only valid measure of your MDM efforts is not the number of data elements collected but the number of “data conformant” reports, data conformant web services, or data conformant components delivered by project teams.
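As a rough illustration of this measure, the sketch below counts delivered artifacts that are data conformant. The article does not define conformance precisely, so the rule used here (a deliverable is conformant when every data element it exposes appears in the agreed master definitions) is an assumption, and the names `Deliverable`, `MASTER_ELEMENTS`, `is_conformant`, and `conformance_summary` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical master data elements agreed on so far.
MASTER_ELEMENTS = {"customer_id", "customer_name", "postal_code", "order_id"}


@dataclass
class Deliverable:
    name: str
    kind: str          # e.g. "report", "web service", "component"
    fields_used: set   # data element names the deliverable exposes


def is_conformant(deliverable, master=MASTER_ELEMENTS):
    # Assumed rule: every exposed element must be a master element.
    return deliverable.fields_used <= master


def conformance_summary(deliverables, master=MASTER_ELEMENTS):
    """Count conformant deliverables by kind; nonconformant ones are skipped."""
    summary = {}
    for d in deliverables:
        if is_conformant(d, master):
            summary[d.kind] = summary.get(d.kind, 0) + 1
    return summary


delivered = [
    Deliverable("Churn report", "report", {"customer_id", "customer_name"}),
    Deliverable("Order lookup", "web service", {"order_id", "cust_no"}),
    Deliverable("Address report", "report", {"customer_id", "postal_code"}),
]
print(conformance_summary(delivered))
```

The metric counts what was delivered to stakeholders, so collecting more metadata without shipping anything leaves the numbers unchanged.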
Agile software development teams work in priority order, as you saw in Figure 2, and thereby they maximize stakeholder return on investment (ROI) by focusing on delivering the highest value functionality at any given time. If all of your development teams work in this manner, and because agile MDM work is embedded in the development process, you similarly will maximize the ROI on your MDM efforts.
This differs from traditional MDM efforts which try to capture the required metadata in a "big modeling up front (BMUF)" style effort. This is often in the form of a multi-month if not multi-year effort run by a DM project team in parallel to actual software development projects. There are several problems with the traditional approach to MDM:
Agile software developers typically take a test-first approach to development, also called test-driven development (TDD) or behavior-driven development (BDD). For data professionals this is not only possible, it is highly desirable. With a test-driven approach you write a single test before doing the work to fulfill that test, in effect creating a detailed specification for that functionality before implementing it. Better still, you can run the tests on a regular basis and thereby validate your work in progress. A test-first approach, in combination with other agile testing activities, greatly increases the quality of the work delivered. This shouldn’t come as a surprise: testing as early as you possibly can, fixing the defects that you find, and doing so more often leads to improved quality.
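A minimal sketch of what test-first can look like for database work follows, using Python's standard `unittest` and `sqlite3` modules with an in-memory database. The `customer` table, its columns, and the two rules being tested are hypothetical examples. The tests are written first as the specification, and `create_schema` is then written to make them pass.

```python
import sqlite3
import unittest


class CustomerTableTest(unittest.TestCase):
    """Written first: these tests specify the rules the schema must enforce."""

    def setUp(self):
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_name_is_mandatory(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute(
                "INSERT INTO customer (customer_id, name) VALUES (1, NULL)"
            )

    def test_email_is_unique(self):
        self.conn.execute(
            "INSERT INTO customer (name, email) VALUES ('Ann', 'a@example.com')"
        )
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute(
                "INSERT INTO customer (name, email) VALUES ('Bob', 'a@example.com')"
            )


def create_schema(conn):
    """Written second: just enough schema to satisfy the tests above."""
    conn.execute(
        "CREATE TABLE customer ("
        " customer_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " email TEXT UNIQUE)"
    )


def run_tests():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CustomerTableTest)
    return unittest.TextTestRunner().run(suite)
```

Calling `run_tests()` after every schema change gives the regular regression check the paragraph above describes; removing a constraint from `create_schema` makes the corresponding test fail.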
Traditional teams often take a review-based approach to development, particularly early in the lifecycle when they have no software to work with. Although better than doing nothing at all, reviews prove ineffective in practice when compared with regression testing when it comes to quality. Reviews have a very long feedback cycle, often weeks if not months, and as a result the costs of addressing defects are much higher than with techniques (such as TDD) that have shorter feedback cycles. If someone can offer actual value in a review, why not have them involved with the actual work to begin with? In short, reviews often seem to be a stop-gap measure which compensates for poor collaboration or lack of quality focus earlier in the lifecycle. It is far better to address the real problem, hopefully with Agile strategies, than to simply put a band-aid over it and hope for the best. And the numbers clearly show that traditional approaches to data quality are failing in practice: The Data Warehousing Institute (TDWI) reports that data quality problems result in a loss of over $600 billion annually in the United States.
Traditional governance often focuses on command-and-control strategies which strive to manage and direct development project teams in an explicit manner. This approach is akin to herding cats because you'll put a lot of work into the governance effort but achieve very little in practice. Agile/lean data governance focuses on collaborative strategies that strive to enable and motivate team members implicitly. This is akin to leading cats – if you grab a piece of raw fish, cats will follow you wherever you want to go.
An important component of data management is governance of the MDM metadata and of the source data which it represents. My experience is that a traditional, command-and-control approach where the DM group “owns” the data assets within your organization and has a “death-lock” on your databases proves dysfunctional in practice. At best it results in the DM group becoming a bottleneck within your IT department and at worst it results in the development teams going around the DM group in order to get their work done, effectively negating your data governance efforts (some alarming statistics on this in a minute). A better approach is to:
The real challenges with MDM have nothing to do with technology but instead with people. In many organizations there is a significant cultural impedance mismatch that you need to overcome between the data management group and the development teams. This will take time. This mismatch was revealed in the results of the IBM survey into CRM as well as a data management survey performed by Dr. Dobb’s Journal in the fall of 2006. The survey found that 66% of respondents indicated the need to go around their data groups at times, and that of those people 75% indicated that they did so because the data groups were too slow to respond to their requests, provided too little real value to the development teams, or were simply too difficult to work with.
The data community must recognize that we can do better than the traditional strategy for MDM, and for data management in general. Although many data professionals prefer traditional, documentation-heavy approaches, they must recognize that the rest of the IT community has moved on and has adopted more effective ways of working. An Agile approach to MDM is more effective than a traditional approach, for several reasons:
Master Data Management (MDM), when implemented correctly, can provide significant value to your organization. Unfortunately, our track record with similar efforts in the past, in particular Customer Relationship Management (CRM) and metadata repositories before that, was less than ideal. I believe that you will greatly increase your chance of success by applying agile techniques such as working in an evolutionary manner, taking a usage-driven approach, focusing on measurable results, working collaboratively, delivering quality through testing, and adopting a lean approach to data governance.
This site owned by Ambysoft Inc.