Agile Data

The Cultural Impedance Mismatch Between Data Professionals and Application Developers


It is well known that there is a technical impedance mismatch between object-oriented technology and relational database technology. It is also well known, although not as well recognized, that there is a cultural impedance mismatch (something that I used to call the object-data divide), which refers to the politics between the object community and the data community. Specifically, it refers to the difficulties that object-oriented and data-oriented developers experience when working together and, more generally, to the dysfunctional politics between the two communities that occur within IT organizations and even the IT industry itself. Worse yet, this impedance mismatch has become even more pronounced between the agile and data communities. This article discusses:
  1. How to recognize that you have a problem
  2. The extent of the problem
  3. How the cultural impedance mismatch came about
  4. Addressing the cultural impedance mismatch

1. How to Recognize That You Have a Problem

Symptoms of the cultural impedance mismatch include:
  • Application developers who claim that relational technology either shouldn't or can't be used to store objects
  • Data professionals who claim that your object/component models must be driven by their data models
  • Application developers who claim that because they're using a persistence framework they don't need to understand anything about the underlying data technology
  • Data professionals who disparage agile software development approaches, yet when pressed know very little about agile
  • Application developers who complain about the "useless data bureaucracy" without understanding why the data activities have been adopted
  • Data professionals who complain about the data messes created by application developers (yet rarely seem to want to train developers to do the job right)

2. The Extent of the Problem

A July 2006 survey into the current state of data management found that 66% of respondents reported that development teams sometimes go around their data management (DM) groups. The reasons given are summarized in Figure 1. One quarter of the problem appears to be attributable to developers: 8% didn't know that the data group existed and 17% didn't know that they needed to work with it. Luckily these problems can easily be addressed through education first and improved governance second. The remaining three quarters of the problem, attributable to data professionals, will prove more difficult to address: 20% of development teams felt that their data group was too difficult to work with, 36% believed that the data group was too slow to respond to their requests, and 19% felt that the data group offered too little value. The implication is that data professionals will need to change the way that they work, very likely adopting new techniques as well as new philosophies.

Figure 1. Reasons why development teams go around data groups.


This reluctance of development teams to interact effectively, if at all, with data management groups exacerbates the significant data quality problems in modern organizations. The development teams are unable to leverage the knowledge, skills, and experience of the data professionals available to them, and thereby make common mistakes such as creating new data sources instead of taking advantage of existing ones, creating a substandard database design due to a lack of skills in this area, and not conforming to corporate data naming and metadata conventions. The data management group also suffers because it is unaware of the new database development strategies of the agile community (such as database regression testing, database refactoring, and continuous integration) and thereby misses out on concrete, quality-focused techniques.
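To make one of the techniques named above more concrete, the following is a minimal sketch of what a database regression test might look like. It is an illustration only: the `account` table, its `CHECK` constraint, and the function names are hypothetical, and it uses Python's built-in `sqlite3` module with an in-memory database rather than any particular production DBMS.

```python
import sqlite3

# Hypothetical schema under test; the table and constraint are illustrative only.
SCHEMA = """
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    balance    INTEGER NOT NULL CHECK (balance >= 0)
);
"""


def build_database():
    """Create a fresh in-memory database so every test run starts clean."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def rejects_negative_balance(conn):
    """The schema should refuse a row that violates the CHECK constraint."""
    try:
        conn.execute("INSERT INTO account (account_id, balance) VALUES (1, -5)")
    except sqlite3.IntegrityError:
        return True
    return False


def accepts_valid_balance(conn):
    """A valid row should be stored and read back unchanged."""
    conn.execute("INSERT INTO account (account_id, balance) VALUES (2, 100)")
    row = conn.execute(
        "SELECT balance FROM account WHERE account_id = 2"
    ).fetchone()
    return row == (100,)


if __name__ == "__main__":
    conn = build_database()
    assert rejects_negative_balance(conn)
    assert accepts_valid_balance(conn)
    print("database regression checks passed")
```

Checks like these can be rerun after every schema change, in the same way that application developers rerun their unit tests, so that a change that silently weakens a data rule is caught early.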

 

3. How the Cultural Impedance Mismatch Came About

To understand why our industry suffers from the object-data divide you need to consider the history of the information technology industry; see Figure 2 for a timeline. Object technology was first introduced in the late 1960s and adopted by the business community in the late 1980s and early 1990s, marking what I consider to be the first point of divergence. Up until then the data and developer communities pretty much worked to the same set of philosophies and strategies. It wasn't perfect, but at least the two groups were reasonably in sync with each other.

Figure 2. A timeline of the divergence between the data and development communities.

But the object revolution opened a cultural gap between the two communities that has existed ever since. As with most other new technologies, there was spectacular hype surrounding objects at the start: Everything is an object. Object technology is a silver bullet that solves all of our problems. Objects are easier to understand and to work with. Object technology is the only thing that you'll ever need. In time reality prevailed and these claims were seen for what they were: wishful thinking at best. Unfortunately one bit of hype did serious damage: the idea that the pure approach supported by objectbases would quickly eclipse the "questionable" use of relational technologies. This mistaken belief, combined with the findings of several significant research studies that showed that object techniques and structured techniques don't mix well in practice, led many within the object community to proclaim that objects and relational databases shouldn't be used together.

At the same time the data community was coming into its own. Already important in the traditional mainframe world, data modelers found their role in the two-tier client/server world (the dominant technology at the time for new application development) to be equally critical. Development in both of these worlds worked similarly: the data professionals would develop the data schema and the application developers would write their program code. This worked because there wasn't a lot of conceptual overlap between the two tasks: data models showed the data entities and their relationships, whereas the application/process models showed how the application worked with the data. From the point of view of data professionals very little had changed in their world. Then object technology came along. Some data professionals, and I was among them, quickly recognized that the object paradigm was a completely new way to develop software and joined the growing object crowd. Unfortunately many data professionals believed the object paradigm to be either another fad doomed to fail or merely another programming technology, and therefore remained content with what they perceived to be the status quo.

Unfortunately both communities got it wrong. To the dismay of object purists, objectbases never proved to be more than a niche technology, whereas relational databases have effectively become the de facto standard for storing data. Furthermore, the studies of the late 80s and early 90s actually showed that you shouldn't use structured models for object implementation languages such as C++ or Smalltalk, or object models for structured implementation languages such as COBOL or BASIC (apparently, it's smart to apply the right artifact(s) for the situation). They didn't address the idea of melding object and structured modeling techniques in order to drive your efforts when working with implementation technologies such as object programming languages and relational databases. In fact, practice has shown that it is reasonably straightforward to map objects to relational databases.
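As a rough illustration of that last point, here is a minimal sketch of mapping a single class to a single relational table. Everything in it is an assumption made for the example: the `Customer` class, the `customer` table, and the `save` and `load` helpers are hypothetical, and the sketch uses Python's built-in `sqlite3` module rather than a full persistence framework. Real mappings also have to deal with inheritance, relationships, and concurrency, which this sketch ignores.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """A hypothetical business object; each attribute maps to one column."""
    customer_id: int
    name: str
    email: str


def create_schema(conn: sqlite3.Connection) -> None:
    """One class maps to one table in this simple case."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        "customer_id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "email TEXT NOT NULL)"
    )


def save(conn: sqlite3.Connection, customer: Customer) -> None:
    """Write the object's attributes to its row, replacing any existing row."""
    conn.execute(
        "INSERT OR REPLACE INTO customer (customer_id, name, email) "
        "VALUES (?, ?, ?)",
        (customer.customer_id, customer.name, customer.email),
    )


def load(conn: sqlite3.Connection, customer_id: int) -> Optional[Customer]:
    """Rebuild the object from its row, or return None if no row exists."""
    row = conn.execute(
        "SELECT customer_id, name, email FROM customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(*row) if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    save(conn, Customer(1, "Ada", "ada@example.com"))
    print(load(conn, 1))
```

Even in a sketch this small, both skill sets are needed: the class design belongs to the object side, while the table design, keys, and constraints belong to the data side.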

When it came to process there was a significant difference between the two communities. Throughout the 1990s the majority of new software development used object and component-based technology and followed evolutionary processes in the mold of Barry Boehm’s spiral lifecycle. Where the data community for the most part stuck with what they knew to be tried and true, the developer community started experimenting with new techniques and technologies, pushing the software process boundaries. Yet, although many promises were made and many case studies written, and even though modeling languages unified under a single banner, the productivity gains proved to be much smaller than expected. Then, in 2001, seventeen thought leaders from the object community decided to go skiing. In the evening they gathered to discuss what works in practice, rather than what they’d been told should work in theory, when it comes to building systems. Surprisingly they actually agreed on a set of values and principles, which were captured in the publication of the Agile Manifesto, and the second divergence occurred. Evolutionary development was good, but doing so in a highly collaborative and quality-driven manner was even better. The chasm between data professionals and developers was growing even wider, and the agile philosophies and techniques actually provided the productivity gains which had been promised in the first age of divergence. Agile teams are now achieving measurably higher success rates than traditional application development teams and data warehousing projects (71.5% compared with 62.8% and 62.6% respectively), calling into question the approaches preferred by the traditional data community.

To the dismay of data professionals, object modeling techniques, particularly those of the Unified Modeling Language (UML), are significantly more robust than data modeling techniques and are arguably a superset of data modeling (Muller 1999). The object approach had superseded the data approach. In fact there was such a significant conceptual overlap that many data professionals, not having recognized the subtle differences, mistakenly believed that class diagrams were merely data models with operations added in. What they didn't recognize is that the complexity of modeling behavior requires more than just class diagrams, hence the wealth of models defined by the UML, and that their focus on data alone was too narrow for the needs of modern application development. Object techniques proved to work well in practice. Not only is object technology not a fad, it has become the dominant development platform, and the status quo has changed to the point that most modern development methodologies devote no more than a few pages to data modeling (to their detriment).

 

4. Addressing the Cultural Impedance Mismatch

Overcoming the cultural impedance mismatch is much more difficult than overcoming the technical impedance mismatch.  Some strategies to help you do so:

  • Everyone needs to recognize that the problem exists and needs to be overcome.  Developers and data professionals have different skills, different backgrounds, different philosophies, and different ways that they prefer to work. Instead of finding ways to work together that take advantage of these differences, many IT shops have chosen to erect communication and political barriers between the two groups of professionals. These barriers must be removed, something that the adoption of the Agile Data (AD) method can help with.
  • Recognize that one process size does not fit all.  Different projects require different approaches and you need to manage accordingly.  A data warehousing project is different than a web-site development project. A team of three people will work differently than a team of thirty, or three hundred. A team that is co-located will work differently than a distributed team. A team working in a regulatory environment will work differently than one that is not. A team working with legacy systems will work differently than a team that is developing a greenfield system. It isn’t sufficient for the data group to be right, or for the application group to be right; they need to be right together.  We need to stop playing political games and instead find ways to work together.
  • Recognize that we need to consider the entire architectural picture. Too many application developers struggle to understand the fundamentals of data technology.  Then again, too many data professionals struggle to understand architectural concepts beyond the narrow confines of the data realm.
  • Train developers in data skills.
  • Train data professionals in development skills.
  • Adopt an evolutionary, or better yet agile, approach to data administration and to data architecture.
  • Adopt an agile/lean approach to data governance.