Agile Data


This article is a work in progress.

This article covers:

Introduction to Relational Database Testing

I believe that the virtual absence of discussion about testing within the data management community is the primary cause of the $611 billion annual loss, as reported by The Data Warehousing Institute, that North American organizations experience as a result of poor data quality. Relational database management systems (RDBMSs) often persist mission-critical data and implement mission-critical functionality. We've known for years that effective testing enables you to improve quality, and that testing often and early in the lifecycle can do so dramatically. It seems to me that an important activity for improving database quality, if not the most important one, is to test our databases often (and better yet regressively). Database testing is an important part of agile testing and should be an important part of traditional approaches to testing as well. Figure 1 indicates what you should consider testing when it comes to relational databases. The diagram is drawn from the point of view of a single database. The dashed lines indicate threat boundaries, which means that you need to consider threats both within the database (clear box testing) and at the interface to the database (black box testing).

Figure 1. What to test.

Functionality Testing in Relational Databases


Stored procedures and triggers. Stored procedures and triggers should be tested just like your application code would be.
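As a rough illustration of testing a trigger the way you would test application code, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The choice of SQLite, the table names (account, account_audit) and the trigger name are assumptions made for this example only. SQLite has no stored procedures, so only a trigger is shown.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database with one audit trigger (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
        CREATE TABLE account_audit (
            account_id INTEGER, old_balance INTEGER, new_balance INTEGER
        );
        CREATE TRIGGER trg_account_audit AFTER UPDATE OF balance ON account
        BEGIN
            INSERT INTO account_audit VALUES (OLD.id, OLD.balance, NEW.balance);
        END;
    """)
    return conn

def test_update_writes_audit_row():
    """Arrange a row, act by updating it, assert on what the trigger wrote."""
    conn = make_db()
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
    rows = conn.execute(
        "SELECT account_id, old_balance, new_balance FROM account_audit"
    ).fetchall()
    assert rows == [(1, 100, 150)]
    return rows

audit_rows = test_update_writes_audit_row()
```

Building a fresh database inside each test keeps the tests independent of one another, which makes them suitable for a regression suite.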

Relationship Testing in Relational Databases


Referential integrity (RI). RI rules, in particular cascading deletes in which highly coupled "child" rows are deleted when a parent row is deleted, should also be validated. Existence rules, such as the rule that a customer row must exist before a corresponding row can be inserted into the Account table, can be easily tested too.
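Both kinds of RI rule can be exercised with a couple of small tests. The sketch below again assumes SQLite through Python's sqlite3 module, and the customer and account tables are hypothetical stand-ins for your own schema. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customer(id) ON DELETE CASCADE
    );
""")

def existence_rule_holds():
    """An account whose customer does not exist must be rejected."""
    try:
        conn.execute("INSERT INTO account (id, customer_id) VALUES (10, 999)")
    except sqlite3.IntegrityError:
        conn.rollback()
        return True
    return False

def orphans_after_cascade_delete():
    """Delete a parent row and count the child rows that survive."""
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
    conn.execute("INSERT INTO account (id, customer_id) VALUES (10, 1)")
    conn.execute("INSERT INTO account (id, customer_id) VALUES (11, 1)")
    conn.execute("DELETE FROM customer WHERE id = 1")
    return conn.execute(
        "SELECT COUNT(*) FROM account WHERE customer_id = 1"
    ).fetchone()[0]

existence_ok = existence_rule_holds()
orphan_count = orphans_after_cascade_delete()
```

A passing run leaves existence_ok true and orphan_count at zero. If someone drops the foreign key or the cascade clause, one of the two checks fails.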

Data Quality Testing in Relational Databases


Default values. Columns often have default values defined for them. Are the default values actually being assigned? (Someone could have accidentally removed this part of the table definition.)

Data invariants. Columns often have invariants, implemented in the forms of constraints, defined for them. For example, a number column may be restricted to containing the values 1 through 7. These invariants should be tested.

Attribute size. Does the field size defined in the application match the column size defined in the database?
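The three data quality checks above might look something like the following sketch. It assumes SQLite through Python's sqlite3 module. The task table, its columns, and the APP_NAME_MAX_LENGTH constant are hypothetical names invented for the example. Because SQLite records a declared length such as VARCHAR(40) without enforcing it, the size check here compares the declared length against the application's limit rather than relying on the database to reject long values.

```python
import re
import sqlite3

APP_NAME_MAX_LENGTH = 40  # hypothetical limit enforced by the application layer

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        name VARCHAR(40) NOT NULL,
        status TEXT NOT NULL DEFAULT 'open',
        day_of_week INTEGER NOT NULL CHECK (day_of_week BETWEEN 1 AND 7)
    )
""")

def default_status():
    """Insert a row without a status and read back what the database assigned."""
    conn.execute(
        "INSERT INTO task (id, name, day_of_week) VALUES (1, 'write tests', 3)"
    )
    return conn.execute("SELECT status FROM task WHERE id = 1").fetchone()[0]

def accepts_day(task_id, day):
    """Return True if the CHECK constraint lets this day_of_week value in."""
    try:
        conn.execute(
            "INSERT INTO task (id, name, day_of_week) VALUES (?, 'probe', ?)",
            (task_id, day),
        )
        return True
    except sqlite3.IntegrityError:
        return False

def declared_length(table, column):
    """Parse the declared length, e.g. 40 from VARCHAR(40), out of the catalog."""
    for _cid, name, col_type, *_rest in conn.execute(f"PRAGMA table_info({table})"):
        if name == column:
            match = re.search(r"\((\d+)\)", col_type)
            return int(match.group(1)) if match else None
    return None

assigned_default = default_status()
boundary_results = [accepts_day(2, 1), accepts_day(3, 7),
                    accepts_day(4, 0), accepts_day(5, 8)]
name_length = declared_length("task", "name")
```

Testing the invariant at its boundaries (1 and 7 accepted, 0 and 8 rejected) catches an off-by-one error in the constraint as well as its outright removal.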

Performance Testing of Relational Databases


Access time to read/write/delete a single row.

Access time for common queries returning multiple rows.

Access time for queries involving several tables.

Existence test for an index. Does the expected index exist or not?
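A simple way to start on the performance checks listed above is to time representative queries against a known data volume and compare the result to an agreed budget. The sketch below assumes SQLite through Python's sqlite3 module. The schema, the row counts, the index name, and the BUDGET_SECONDS threshold are all hypothetical, and timings taken against an in-memory database say little about a production server, so treat this as the shape of the test rather than a benchmark.

```python
import sqlite3
import time

BUDGET_SECONDS = 0.5  # hypothetical per-query budget; agree on a real one with stakeholders

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total INTEGER NOT NULL
    );
    CREATE INDEX idx_orders_customer ON orders (customer_id);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(i, f"c{i}") for i in range(1000)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 1000, i) for i in range(10000)])
conn.commit()

def timed(sql, params=()):
    """Run a query, fetch every row, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start

def index_exists(name):
    """Look the index up in SQLite's catalog table."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    return row is not None

single, t_single = timed("SELECT * FROM orders WHERE id = ?", (42,))
multi, t_multi = timed("SELECT * FROM orders WHERE customer_id = ?", (7,))
joined, t_join = timed(
    "SELECT c.name, o.total FROM customer c "
    "JOIN orders o ON o.customer_id = c.id WHERE c.id = ?", (7,)
)
within_budget = all(t < BUDGET_SECONDS for t in (t_single, t_multi, t_join))
```

The index existence test is cheap and deterministic, so it is worth keeping even when timing tests are too noisy to run on every build.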

Structural Testing in Relational Databases


Table existence. Do the expected tables exist, and is all the data from the application being inserted into them properly?

View definitions. Views often implement interesting business logic. Things to look out for include: Does the filtering/select logic work properly? Do you get back the right number of rows? Are you returning the right columns? Are the columns, and rows, in the right order?
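The structural questions above translate fairly directly into assertions. This sketch assumes SQLite through Python's sqlite3 module, and the customer table and active_customer view are hypothetical. Row order is checked on a query with an explicit ORDER BY, since SQL does not in general guarantee the order of rows returned from a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        active INTEGER NOT NULL
    );
    CREATE VIEW active_customer AS
        SELECT id, name FROM customer WHERE active = 1;
    INSERT INTO customer VALUES (1, 'Zoe', 1), (2, 'Bob', 0), (3, 'Ann', 1);
""")

def table_exists(name):
    """Check SQLite's catalog for a table with this name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

# Query the view the way the application would, then inspect shape and content.
cursor = conn.execute("SELECT * FROM active_customer ORDER BY name")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

assert table_exists("customer")               # the expected table is present
assert columns == ["id", "name"]              # right columns, in the right order
assert len(rows) == 2                         # the inactive customer is filtered out
assert rows == [(3, "Ann"), (1, "Zoe")]       # right rows, in the right order
```

Seeding the table with at least one row that the view should exclude is what makes the filtering check meaningful.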


The following terminology is used throughout this article: