It is well known that there is a technical impedance mismatch between
object-oriented technology and relational database technology. It is also well
known, although not as well recognized, that there is a cultural impedance
mismatch (something that I used to call the object-data divide) between the
developer community and the data community. Specifically, the term refers to
the difficulties that developers and data professionals experience when working
together, and more generally to the dysfunctional politics between the two
communities that occur within IT organizations and even within the IT industry
itself. Worse yet, this impedance mismatch has become even more pronounced
between the agile and data communities. This article discusses:
- How to recognize that you have a problem
- The extent of the problem
- How the cultural impedance mismatch came about
- Addressing the cultural impedance mismatch
Symptoms of the cultural impedance mismatch include:
- Application developers who claim that relational technology either shouldn't
  or can't be used to store objects
- Data professionals who claim that your object/component models must be driven
  by their data models
- Application developers who claim that because they're using a persistence
  framework they don't need to understand anything about the underlying data
  technology
- Data professionals who disparage agile software development approaches yet,
  when pressed, know very little about agile
- Application developers who complain about the "useless data bureaucracy"
  without understanding why the data activities have been adopted
- Data professionals who complain about the data messes created by application
  developers (yet rarely seem to want to train developers to do the job right)
A July 2006 survey into the current state of data management found that 66% of
respondents reported that development teams sometimes go around their data
management (DM) groups. The reasons given are summarized in Figure 1. As you
can see, one quarter of the problem seems to be attributable to developers: 8%
didn't know that the data group existed and 17% didn't know that they needed to
work with it. Luckily these problems can be addressed fairly easily, through
education first and improved governance second. The remaining three quarters of
the problem, attributable to data professionals, will prove more difficult to
address: 20% of development teams felt that their data group was too difficult
to work with, 36% believed that the data group was too slow to respond to their
requests, and 19% felt that the data group offered too little value. The
implication is that data professionals will need to change the way that they
work, very likely adopting new techniques as well as new philosophies.
Figure 1. Reasons why development teams go around data groups.

This reluctance of development teams to interact effectively, if at all, with
data management groups exacerbates the significant data quality problems in
modern organizations. Development teams that do not draw on the knowledge,
skills, and experience of the data professionals available to them make common
mistakes such as creating new data sources instead of taking advantage of
existing ones, creating a substandard database design due to a lack of skills in
this area, and not conforming to corporate data naming and metadata conventions.
The data management group also suffers: because its members are unaware of the
new database development strategies of the agile community (such as database
regression testing, database refactoring, and continuous integration), they
miss out on concrete, quality-focused techniques.
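
To make one of these techniques concrete, here is a minimal sketch of what a
database regression test might look like. It is an illustration only: the
`account` table, the `build_schema` helper, and the rule that balances cannot be
negative are all hypothetical, and an in-memory SQLite database stands in for
whatever database a real team would use.

```python
import sqlite3
import unittest


def build_schema(conn):
    # Hypothetical schema under test. In practice this would be the team's
    # real DDL or migration script rather than an inline statement.
    conn.execute(
        "CREATE TABLE account ("
        "account_id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )


class AccountSchemaRegressionTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test keeps the tests independent.
        self.conn = sqlite3.connect(":memory:")
        build_schema(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_negative_balance_is_rejected(self):
        # The (assumed) business rule lives in the schema, so the database
        # itself must refuse the row.
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO account (balance) VALUES (-1)")

    def test_valid_row_round_trips(self):
        self.conn.execute("INSERT INTO account (balance) VALUES (100)")
        row = self.conn.execute("SELECT balance FROM account").fetchone()
        self.assertEqual(row[0], 100)
```

Tests like these can be run with `python -m unittest` on every schema change, so
that a refactoring which accidentally drops the constraint is caught right away.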
To understand why our industry suffers from the object-data divide you need to
consider the history of the information technology industry; see Figure 2 for a
timeline. Object technology was first introduced in the late 1960s and was
adopted by the business community in the late 1980s and early 1990s, marking
what I consider to be the first point of divergence. Up until then the data and
developer communities pretty much worked to the same set of philosophies and
strategies. It wasn't perfect, but at least the two groups were reasonably in
sync with each other.
Figure 2. A timeline of the divergence between the data and development communities.
But the object revolution opened a cultural gap between the two communities
that has existed ever since. As with most other new technologies, there was
spectacular hype surrounding objects at the start: everything is an object;
object technology is a silver bullet that solves all of our problems; objects
are easier to understand and to work with; object technology is the only thing
that you'll ever need. In time reality prevailed and these claims were seen for
what they were: wishful thinking at best. Unfortunately one bit of hype did
serious damage: the idea that the pure approach supported by objectbases would
quickly eclipse the "questionable" use of relational technologies. This
mistaken belief, combined with the findings of several significant research
studies showing that object techniques and structured techniques don't mix well
in practice, led many within the object community to proclaim that objects and
relational databases shouldn't be used together.
At the same time the data community was coming into its own. Already important
in the traditional mainframe world, data modelers found their role in the
two-tier client-server world (the dominant technology at the time for new
application development) to be equally critical. Development in both of these
worlds worked similarly: the data professionals would develop the data schema
and the application developers would write their program code. This worked
because there wasn't a lot of conceptual overlap between the two tasks: data
models showed the data entities and their relationships, whereas the
application/process models showed how the application worked with the data.
From the point of view of data professionals, very little had changed in their
world.
Then object technology came along. Some data professionals, myself among them,
quickly recognized that the object paradigm was a completely new way to develop
software and joined the growing object crowd. Unfortunately many data
professionals believed the object paradigm to be either another fad doomed to
fail or merely another programming technology, and therefore remained content
with what they perceived to be the status quo.
Unfortunately both communities got it wrong. To the dismay of object purists,
objectbases never proved to be more than a niche technology, whereas relational
databases have effectively become the de facto standard for storing data.
Furthermore, the studies of the late 1980s and early 1990s actually showed that
you shouldn't use structured models for object implementation languages such as
C++ or Smalltalk, or object models for structured implementation languages such
as COBOL or BASIC (apparently, it's smart to apply the right artifacts for the
situation). They didn't address the idea of melding object and structured
modeling techniques in order to drive your efforts when working with
implementation technologies such as object programming languages and relational
databases. In fact, practice has shown that it is reasonably straightforward to
map objects to relational databases.
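
As a rough illustration of how simple the basic case can be, the sketch below
maps one class to one table and each attribute to one column, using Python's
standard `sqlite3` module. The `Customer` class, the `customer` table, and the
`create_schema`, `save`, and `load` helpers are hypothetical names invented for
this example; real mappings also have to deal with inheritance, relationships,
and concurrency, which this sketch ignores.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """A hypothetical domain object; the names are illustrative only."""
    name: str
    email: str
    customer_id: Optional[int] = None  # assigned by the database on insert


def create_schema(conn: sqlite3.Connection) -> None:
    # One class maps to one table; each attribute maps to one column.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        "customer_id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "email TEXT NOT NULL)"
    )


def save(conn: sqlite3.Connection, customer: Customer) -> Customer:
    # Insert a new row, or update the existing row if the object has a key.
    if customer.customer_id is None:
        cur = conn.execute(
            "INSERT INTO customer (name, email) VALUES (?, ?)",
            (customer.name, customer.email),
        )
        customer.customer_id = cur.lastrowid
    else:
        conn.execute(
            "UPDATE customer SET name = ?, email = ? WHERE customer_id = ?",
            (customer.name, customer.email, customer.customer_id),
        )
    return customer


def load(conn: sqlite3.Connection, customer_id: int) -> Optional[Customer]:
    # Rebuild the object from its row; the column order matches the fields.
    row = conn.execute(
        "SELECT name, email, customer_id FROM customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(*row) if row else None
```

A caller would open a connection, call `create_schema` once, and then use
`save` and `load` to move objects in and out of the table. The point is only
that the object model and the data model can coexist without either one having
to dictate the other.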
When it came to process there was a significant difference between the two
communities. Throughout the 1990s the majority of new software development used
object and component-based technology and followed evolutionary processes in
the mold of Barry Boehm's spiral lifecycle. Whereas the data community for the
most part stuck with what they knew to be tried and true, the developer
community started experimenting with new techniques and technologies, pushing
the software process boundaries. Yet, although many promises were made and many
case studies written, and even though the modeling languages were unified under
a single banner, the productivity gains proved to be much smaller than
expected. Then, in 2001, seventeen thought leaders from the object community
decided to go skiing. In the evenings they gathered to discuss what works in
practice, rather than what they'd been told should work in theory, when it comes
to building systems. Surprisingly they actually agreed on a set of values and
principles, which were captured in the publication of the Agile Manifesto, and
the second divergence occurred. Evolutionary development was good, but doing so
in a highly collaborative and quality-driven manner was even better. The chasm
between data professionals and developers grew even wider, and the agile
philosophies and techniques actually provided the productivity gains that had
been promised in the first age of divergence. Agile teams are now achieving
measurably higher success rates than traditional application development teams
and data warehousing projects (72% compared with 63% and 63% respectively),
calling into question the approaches preferred by the traditional data
community.
The object approach had superseded the data approach. In fact, there was such a
significant conceptual overlap that many data professionals mistakenly believed
that class diagrams were merely data models with operations added in, because
they hadn't recognized the subtle differences. What they didn't recognize is
that the complexity of modeling behavior requires more than just class
diagrams, hence the wealth of models defined by the UML, and that their focus
on data alone was too narrow for the needs of modern application development.
Object techniques proved to work well in practice. Far from being a fad, object
technology has become the dominant development platform, and the status quo has
changed to the point that most modern development methodologies devote no more
than a few pages to data modeling (to their detriment). This of course is not
the case with the Disciplined Agile Delivery (DAD) framework, which builds agile
database strategies right into its approach.
The good news, as you can see in Figure 2, is that the data community is
starting to adopt agile data warehousing strategies. I suspect that these
strategies will take many years to catch on, but it's clear that this adoption
is now underway.
Overcoming the cultural impedance mismatch is much more difficult than
overcoming the technical impedance mismatch. Some strategies to help you do so:
- Everyone needs to recognize that the problem exists and needs to be overcome.
  Developers and data professionals have different skills, different
  backgrounds, different philosophies, and different ways that they prefer to
  work. Instead of finding ways to work together that take advantage of these
  differences, many IT shops have chosen to erect communication and political
  barriers between the two groups of professionals. These barriers must be
  removed, something that the adoption of the Agile Data (AD) method can help
  with.
- Recognize that one process size does not fit all. Different projects require
  different approaches and you need to manage accordingly. A data warehousing
  project is different than a web-site development project. A team of three
  people will work differently than a team of thirty, or three hundred. A team
  that is co-located will work differently than a distributed team. A team
  working in a regulatory environment will work differently than one that does
  not. A team working with legacy systems will work differently than a team
  that is developing a greenfield system. It isn't sufficient for the data
  group to be right, or for the application group to be right; they need to be
  right together. We need to stop playing political games and instead find ways
  to work together.
- Recognize that we need to consider the entire architectural picture. Too many
  application developers struggle to understand the fundamentals of data
  technology. Then again, too many data professionals struggle to understand
  architectural concepts beyond the narrow confines of the data realm.
- Train developers in data skills.
- Train data professionals in development skills.
- Adopt an evolutionary, or better yet agile, approach to data administration
  and to data architecture.
- Adopt an agile/lean approach to data governance.