Agile Data

Agile Data

Follow @scottwambler on Twitter!

This article is a work in progress.

This article covers:


Introduction to Relational Database Testing

I believe that the virtual absence of discussion about testing within the data management community is the primary cause of the $611 billion annual loss, as reported by The Data Warehouse Institute, experienced by North American organizations resulting from poor data quality.  Relational database management systems (RDBMSs) often persist mission-critical data and implement mission-critical functionality. We've known for years that effective testing enables you to improve quality, and in particular testing often and early in the lifecycle can do so dramatically. It seems to me that to improve database quality an important activity, if not the most important one, is to test our databases often (and better yet regressively).  Database testing is an important part of agile testing and should be an important part of traditional approaches to testing as well. Figure 1 indicates what you should consider testing when it comes to relational databases. The diagram is drawn from the point of view of a single database, the dashed lines indicate threat boundaries, indicating that you need to consider threats both within the database (clear box testing) and at the interface to the database (black box testing). 


Figure 1. What to test.



Functionality Testing in Relational Databases

TBD

Stored procedures and triggers. Stored procedures and triggers should be tested just like your application code would be.



Relationship Testing in Relational Databases

TBD

Referential integrity (RI). RI rules, in particular cascading deletes in which highly coupled "child" rows are deleted when a parent row is deleted, should also be validated. Existence rules, such as a customer row corresponding to an account row, must exist before the row can be inserted into the Account table, and can be easily tested, too.



Data Quality Testing in Relational Databases

TBD

Default values. Columns often have default values defined for them. Are the default values actually being assigned. (Someone could have accidentally removed this part of the table definition.)

Data invariants. Columns often have invariants, implemented in the forms of constraints, defined for them. For example, a number column may be restricted to containing the values 1 through 7. These invariants should be tested.

Validate the attribute size. Is the field size defined in the application is matching with that in the db.


 


Performance Testing of Relational Databases

TBD

Access time to read/write/delete a single row.

Access time for common queries returning multiple rows.

Access time for queries involving several tables.

Existence test for an index. Does the expected index exist or not?



Structural Testing in Relational Databases

TBD

Table existence. We can check whether all the data from the application is being inserted into the database properly, or not
 

View definitions. Views often implement interesting business logic. Things to look out for include: Does the filtering/select logic work properly? Do you get back the right number of rows? Are you returning the right columns? Are the columns, and rows, in the right order?



Terminology

The following terminology is used throughout this article:

  • Database testing. The act of validating the contents, schema, and functionality within a database. From the point of view of a relational database this includes the actual data itself, the table structures and relationships between tables, and the stored procedures/functions or database classes respectively.
  • Database interface testing. Database testing which validates the database at the black-box level. See Figure 1.
  • Internal database testing. Database testing which validates the database at the white/clear-box level. See Figure 1.
  • Database regression testing. The act of running the database test suite on a regular basis, ideally whenever someone does something which could potentially inject a defect into the database such as change how they write data into a database or change some of the code within the database itself.
  • Test Driven Database Development (TDDD). Also known as "Behavior Driven Database Development" (BDDD), this is the act of specifying the design of a database by writing a single test then just enough database code/schema to fulfill that test.