Agile Data

Agile Database Tools and Scripts

Follow @scottwambler on Twitter!

To implement the Agile Data method within your organization you will need to adopt, build, and/or modify a collection of tools.  Tools are just a start, you also need an effective technical environment in which to use them.  This environment should be comprised of several “sandboxes" in which you will work. Finally, Agile DBAs will discover that they need to several different types of scripts to support their development efforts. 

1. Tools

Having an effective toolset is a critical success factor for any software development effort. Table 1 lists categories of tools, the target audience for the tool, how you would use the tool, and links to a representative sample of such tools.  Chances are very good that you already have many of these tools in house, although you will undoubtedly need to obtain several of them.

 

Agile isn't in the tool, it's in the way that you use the tool.

 

Table 2 lists tools that to my knowledge do not exist yet, at least at the time of this writing, that are needed to support the Agile Data method. My hope that we will see both commercial and open source tools available in the near future.

 

Table 1. Potential Tools That Support Agile Data Efforts.

Tool Category

Role

Purpose

Examples

CASE Tool – Development Modeling

Application Developer, Agile DBA

To support your application development efforts.

CASE Tool – Enterprise Modeling

Enterprise Architect

To define and manage your enterprise models.  

CASE Tool – Physical Data Modeling

Agile DBA

To define and manage your physical database schema.  Many data modeling tools support the generation and deployment of DDL code, making it easier to change your database schema.  And they also produce visual representations of your schema and support your documentation efforts.

Configuration Management

Everyone

You need to place all data definition language (DDL), source code, models, scripts, documents, … under version control.

Database Refactoring Tools Application Developer, Agile DBA To evolve your database schema in small, safe steps.
Development IDE/Refactoring Browser

Application Developer, Agile DBA

To support your programming and testing efforts.

Extract Transform Load (ETL)

Agile DBA,

Enterprise Administrator

ETL tools can automate your data cleansing and migrating efforts that evolve your database schema.

Persistence Frameworks

Application Developer, Agile DBA

Persistence frameworks/layers  encapsulate your database schema, minimizing the chance that database refactorings will force code refactorings external applications.

Release Tools Application Developer, Agile DBA You need to deploy your database between sandboxes, including production.
Test Data Generator

Application Developer, Agile DBA

Developers need test data against which to validate their systems. Test data generators can be particularly useful when you need large amounts of data, perhaps for stress and load testing.

Testing tools for load testing, user interface testing, system testing, …

Application Developer,

Agile DBA

You will need to go beyond unit testing to perform a more robust set of tests that go beyond unit testing.  The Full Lifecycle Object-Oriented Testing (FLOOT) method which encapsulates a wide range of traditional and agile testing techniques.

Traceability Management/ Repository

Everyone

Unit testing tools for your applications

Application Developer

Developers must be able to unit test their work, and to support iterative development they must be able to easily regression test.

Unit testing tools for your database

Agile DBA

Whenever you change your database schema, perhaps as the result of a database refactoring, you must be able to regression test your database to ensure that it still works.

Other Agile DBA  
 

Table 2. Future Tools.

Tool Category

Discussion

Automated Schema Traceability Management Tools

Although Table 1 includes traceability management tools the reality is that most tools are geared either towards requirements traceability or data access traceability (as in the case of repositories such as Rochade and Advantage).  Neither are suited for the fine-grained traceability required for database refactoring. Ideally you need a tool that can trace a wide range of application features, such as COBOL procedures and Java operations, to database features such as stored procedures and table columns.  Because of the complexity of this task the less manual intervention the better – ideally it should be able to parse your application and database code and create the traceability matrix automatically.

 


2. Sandboxes

This section has been replaced by the Sandboxes Best Practice article.


3. Scripts

Agile Database Techniques Pramod Sadalage and Peter Schuh (2002) suggest that Agile DBAs maintain what they call a database change log and an update log, the minimum that you require for simple stovepipe projects where a single application accesses your database. However, to support more complex environments where many applications access the your database you also require a data migration log.  Let’s explore how you use each log:

  1. Database change log.  This log contains the data definition language (DDL) source code that implements all database schema changes in the order that they were applied throughout the course of a project. This includes structural changes such as adding, dropping, renaming, or modifying things such as tables, views, columns, and indices. 
  2. Update log.  This log contains the source code for future changes to the database schema that are to be run after the deprecation period for database changes.  The Process of Database Refactoring argues that changing your database schema is inherently more difficult than changing application source code – other developers on your project team need time to update their own code and worse yet other applications may access your database and therefore need to be modified and deployed as well. Therefore you will find that you need to maintain both the original and changed portions of your schema, as well as any scaffolding code to keep your data in sync, for a period of time called the “deprecation period." 
  3. Data migration log. This log contains the data manipulation language (DML) to reformat or cleanse the source data throughout the course of your project.  You may choose to implement these changes using data cleansing utilities, often the heart of extract-transform-load (ETL) tools, examples of which are listed in Table 1.
You may choose to implement each logical script as a collection of physical scripts, perhaps one for each development iteration or even one for each individual database refactoring, or you may choose to implement as a single script that includes the ability to run only a portion of the changes. You need to be able to apply subsets of your changes to be able to put your database schemas into known states.  For example you may find yourself in development iteration 10 to discover that you want to roll back your schema to the way it was at the beginning of iteration 8.