Introduction to Transaction Control
AgileData.org: Techniques for Disciplined Agile Database Development
Transactions are collections of actions that potentially modify two or more entities. An example of a transaction is a transfer of funds between two bank accounts. The transaction consists of debiting the source account, crediting the target account, and recording the fact that this occurred. An important message of this article is that transactions are not simply the domain of databases; they are potentially pertinent to all of your architectural tiers. Therefore it behooves all IT professionals to understand the fundamentals of transaction control.
Let’s start with a few definitions. Bernstein and Newcomer (1997) distinguish between:
Business transactions. A business transaction is an interaction in the real world, usually between an enterprise and a person, where something is exchanged.
Online transaction. An online transaction is the execution of a program that performs an administrative or real-time function, often by accessing shared data sources, usually on behalf of an online user (although some transactions are run offline in batch). This transaction program contains the steps involved in the business transaction. This definition of an online transaction is important because it makes it clear that there is far more to this topic than database transactions.
A transaction-processing (TP) system is the hardware and software that implements the transaction programs. A TP monitor is a portion of a TP system that acts as a kind of funnel or concentrator for transaction programs, connecting multiple clients to multiple server programs (potentially accessing multiple data sources). In a distributed system, a TP monitor will also optimize the use of the network and hardware resources. Examples of TP monitors include IBM’s Customer Information Control System (CICS), IBM’s Information Management System (IMS), BEA’s Tuxedo, and Microsoft Transaction Server (MTS).
The focus of this article is on the fundamentals of online transactions (i.e. the technical side of things). The critical concepts are:
An important fundamental of transactions is the set of four properties that they must exhibit, known by the acronym ACID: atomicity, consistency, isolation, and durability.
As the name suggests there are two phases to the two-phase commit (2PC) protocol: the attempt phase, where each system tries its part of the transaction, and the commit phase, where the systems are told to persist the transaction. The 2PC protocol requires the existence of a transaction manager to coordinate the transaction. The transaction manager assigns a unique transaction ID to the transaction to identify it. The transaction manager then sends the various transaction steps to each system of record so they may attempt them, with each system responding to the transaction manager with the result of the attempt. If an attempted step succeeds then the system of record must lock the appropriate entities and persist the potential changes in some manner (to ensure durability) until the commit phase. Once the transaction manager hears back from all systems of record that the steps succeeded, or once it hears back that a step failed, it sends either a commit or an abort request to every system involved.
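The exchange between the transaction manager and the systems of record can be sketched in a few lines of Python. This is a minimal in-memory sketch, not a production protocol: the names `SystemOfRecord`, `TransactionManager`, `attempt`, `commit`, and `abort` are hypothetical, the staged change is held in memory rather than persisted durably, and the manager collects every vote before deciding instead of aborting as soon as the first failure arrives.

```python
import uuid


class SystemOfRecord:
    """Hypothetical participant: stages a change on attempt, applies it on commit."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.pending = {}  # transaction ID -> staged balance (acts as the lock)

    def attempt(self, tx_id, delta):
        # Refuse if another transaction holds the lock or the change is invalid.
        if self.pending or self.balance + delta < 0:
            return False
        self.pending[tx_id] = self.balance + delta  # stage the potential change
        return True

    def commit(self, tx_id):
        self.balance = self.pending.pop(tx_id)  # make the staged change real

    def abort(self, tx_id):
        self.pending.pop(tx_id, None)  # discard the staged change, release the lock


class TransactionManager:
    def run(self, steps):
        """steps is a list of (system, delta) pairs; returns True if committed."""
        tx_id = uuid.uuid4().hex  # unique transaction ID
        # Phase 1: every system attempts its step and reports the result.
        votes = [system.attempt(tx_id, delta) for system, delta in steps]
        decision = all(votes)
        # Phase 2: commit everywhere only if every attempt succeeded.
        for system, _ in steps:
            if decision:
                system.commit(tx_id)
            else:
                system.abort(tx_id)
        return decision


source = SystemOfRecord("source account", 100)
target = SystemOfRecord("target account", 50)
manager = TransactionManager()

ok = manager.run([(source, -30), (target, +30)])        # both attempts succeed
failed = manager.run([(source, -500), (target, +500)])  # source refuses, so all abort
```

In the second call the target account stages its credit successfully, yet the abort request discards it because the source account could not attempt its debit.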
So far I have discussed flat transactions, transactions whose steps are individual activities. A nested transaction is a transaction where some of its steps are other transactions, referred to as subtransactions. Nested transactions have several important features:
When a program starts a new transaction, if it is already inside an existing transaction then a subtransaction is started; otherwise a new top-level transaction is started.
There does not need to be a limit on the depth of transaction nesting.
When a subtransaction aborts then all of its steps are undone, including any of its subtransactions. However, this does not cause the abort of the parent transaction, instead the parent transaction is simply notified of the abort.
When a subtransaction is executing the entities that it is updating are not visible to other transactions or subtransactions (as per the isolation property).
When a subtransaction commits then the updated entities are made visible to other transactions and subtransactions.
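The features listed above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the `Transaction` class, its `begin`, `write`, `commit`, and `abort` methods, and the undo log are all hypothetical, and locking is not modeled. Uncommitted writes are kept private, committed writes are published to a shared store, and each transaction hands its undo information to its parent so that a later abort of the parent can still undo the work.

```python
_MISSING = object()  # marks a key that did not exist before it was written


class Transaction:
    """Hypothetical nested transaction over a shared dictionary of entities."""

    def __init__(self, store, parent=None):
        self.store = store
        self.parent = parent
        self.writes = {}         # private updates, invisible until commit
        self.undo_log = []       # (key, previous value) for published updates
        self.notifications = []  # messages from aborted subtransactions

    def begin(self):
        # Starting a transaction inside this one creates a subtransaction;
        # nothing limits how deep the nesting goes.
        return Transaction(self.store, parent=self)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Publish the private updates, remembering how to undo each one.
        for key, value in self.writes.items():
            self.undo_log.append((key, self.store.get(key, _MISSING)))
            self.store[key] = value
        self.writes.clear()
        if self.parent is not None:
            # The parent can still undo this work if it aborts later.
            self.parent.undo_log.extend(self.undo_log)

    def abort(self):
        # Drop private updates and undo everything already published,
        # including the work of committed subtransactions.
        self.writes.clear()
        for key, previous in reversed(self.undo_log):
            if previous is _MISSING:
                self.store.pop(key, None)
            else:
                self.store[key] = previous
        self.undo_log.clear()
        if self.parent is not None:
            # The parent is notified but is not aborted.
            self.parent.notifications.append("subtransaction aborted")


store = {"A": 100, "B": 50}
top = Transaction(store)

sub1 = top.begin()
sub1.write("A", 70)
seen_before_commit = store["A"]  # the update is still private to sub1
sub1.commit()
seen_after_commit = store["A"]   # the update is now visible to others

sub2 = top.begin()
sub2.write("B", 0)
sub2.abort()                     # top is notified and keeps running
notes = list(top.notifications)

top.abort()                      # undoes the work sub1 committed
```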
Although transactions are often thought of as a database issue, the reality could not be further from the truth. From the introduction of TP monitors such as CICS and Tuxedo in the 1970s and 1980s, to the CORBA-based object request brokers (ORBs) of the early 1990s, to the EJB application servers of the early 2000s, transactions have clearly been far more than a database issue. This section explores three approaches to implementing transactions that involve both object and relational technology. This material is aimed at application developers as well as Agile DBAs who need to explore strategies that they may not have run across in traditional data-oriented literature. These implementation options are:
The simplest way for an application to implement transactions is to use the features supplied by the database. Transactions can be started, attempted, then committed or aborted via SQL code. Better yet, database APIs such as Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) provide classes and functions that support basic transactional functionality.
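As an illustration, the funds transfer from the introduction can be written as a single database transaction. The sketch below uses Python's standard `sqlite3` module with an in-memory database as a stand-in for whatever database and API you actually use; JDBC exposes the same idea through `commit()` and `rollback()` on its `Connection` interface. The `account` and `transfer_log` tables and the `transfer` function are assumptions made for the example.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Debit, credit, and record the transfer as one database transaction."""
    try:
        # sqlite3 implicitly begins a transaction before the first UPDATE.
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
        )
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (source,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "INSERT INTO transfer_log (source, target, amount) VALUES (?, ?, ?)",
            (source, target, amount),
        )
        conn.commit()    # all three changes become permanent together
        return True
    except Exception:
        conn.rollback()  # none of the changes survive
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("CREATE TABLE transfer_log (source INTEGER, target INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # commits
failed = transfer(conn, 1, 2, 500)  # overdraws the source, so it rolls back
balances = dict(conn.execute("SELECT id, balance FROM account"))
log_rows = conn.execute("SELECT COUNT(*) FROM transfer_log").fetchone()[0]
```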
At the time of this writing support for transaction control is one of the most pressing issues in the web services community and full support for nested transactions is underway within the EJB community. As you see in Figure 1, databases aren’t the only things that can be involved in transactions. The fact is that objects, services, components, legacy applications, and non-relational data sources can all be included in transactions.
Figure 1. Transactions can involve more than just databases.
The advantage of adding behaviors implemented by objects (and similarly services, components, and so on) to transactions is that the transactions become far more robust. Can you imagine using a code editor, word processor, or drawing program without an undo function? If not, then I believe it becomes reasonable to expect behavior invocations as well as data transformations as steps of a transaction. Unfortunately this strategy comes with a significant disadvantage: increased complexity. For this to work your business objects need to be transactionally aware. Any behavior that can be invoked as a step in a transaction must support attempt, commit, and abort/rollback operations. Adding support for object-based transactions is a non-trivial endeavor.
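To make "transactionally aware" concrete, here is a minimal sketch in Python in which each behavior is wrapped as a step offering attempt, commit, and abort operations. The `Customer`, `Rename`, and `QueueLetter` classes and the `run_transaction` function are hypothetical names invented for illustration, and the sketch leaves out locking and recovery entirely.

```python
class Customer:
    def __init__(self, name):
        self.name = name
        self.outbox = []  # letters waiting to be sent


class Rename:
    """A data change that remembers how to undo itself."""

    def __init__(self, customer, new_name):
        self.customer = customer
        self.new_name = new_name
        self.old_name = None

    def attempt(self):
        if not self.new_name:
            return False  # invalid request, nothing was changed
        self.old_name = self.customer.name
        self.customer.name = self.new_name
        return True

    def commit(self):
        self.old_name = None  # the undo information is no longer needed

    def abort(self):
        self.customer.name = self.old_name


class QueueLetter:
    """A behavior invocation, rather than a data update, as a transaction step."""

    def __init__(self, customer, text):
        self.customer = customer
        self.text = text

    def attempt(self):
        self.customer.outbox.append(self.text)
        return True

    def commit(self):
        pass  # the letter simply stays queued

    def abort(self):
        self.customer.outbox.remove(self.text)


def run_transaction(steps):
    """Attempt each step in order; undo the attempted ones if any step fails."""
    attempted = []
    for step in steps:
        if not step.attempt():
            for done in reversed(attempted):
                done.abort()
            return False
        attempted.append(step)
    for step in attempted:
        step.commit()
    return True


customer = Customer("Ann")
ok = run_transaction([Rename(customer, "Anne"), QueueLetter(customer, "Welcome")])
failed = run_transaction([QueueLetter(customer, "Oops"), Rename(customer, "")])
```

Even in this small example every behavior needs three operations plus its own undo state, which is where the added complexity comes from.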
Just as it is possible to have distributed data transactions, it is possible to have distributed object transactions as well. To be more accurate, as you see in Figure 1 it’s just distributed transactions, period. It’s not just about databases any more; it’s databases plus objects plus services plus components plus… and so on.
Sometimes you find that you need to include a non-transactional source within a transaction. A perfect example is an update to information contained in an LDAP directory or the invocation of a web service, neither of which at the time of this writing supports transactions. The problem is that as soon as a step within a transaction is non-transactional, the transaction really isn’t a transaction any more. You have four basic strategies available to you for dealing with this situation:
Remove the non-transactional step from your transaction. In practice this is rarely an option, but if it's a viable strategy then consider doing so.
Implement commit. This strategy, which could be thought of as the “hope the parent transaction doesn’t abort” strategy, enables you to include a non-transactional step within your transaction. You will need to simulate the attempt, commit, and abort protocol used by the transaction manager. The attempt and abort behaviors are simply stubs that do nothing other than implement the requisite protocol logic. The one behavior that you do implement, the commit, will invoke the non-transactional functionality that you want. A different flavor of this approach, which I’ve never seen used in practice, would put the logic in the attempt phase instead of the commit phase.
Implement attempt and abort. This is an extension to the previous technique whereby you basically implement the “do” and “undo” logic but not the commit. In this case, the work is done in the attempt phase; the assumption is that the rest of the transaction will work, but if it doesn’t, you still support the ability to roll back the work. This is an “almost transaction” because it doesn’t avoid the problems with collisions described earlier.
Make it transactional. With this approach, you fully implement the requisite attempt, commit, and abort behaviors. The implication is that you will need to implement all the logic to lock the affected resources and to recover from any collisions. An example of this approach is supported by the J2EE Connector Architecture (JCA), in particular by the LocalTransaction interface.
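The second and third strategies amount to thin wrappers around the non-transactional call, which the following Python sketch illustrates. The class names `CommitOnlyStep` and `AttemptAndAbortStep` are hypothetical, and a plain dictionary stands in for the LDAP directory; a real directory or web service call would replace the lambdas.

```python
class CommitOnlyStep:
    """Strategy #2: attempt and abort are stubs; the real work happens on commit."""

    def __init__(self, action):
        self.action = action

    def attempt(self):
        return True  # stub: always reports success

    def commit(self):
        self.action()  # invoke the non-transactional functionality

    def abort(self):
        pass  # stub: nothing has been done yet, so nothing to undo


class AttemptAndAbortStep:
    """Strategy #3: do the work on attempt, undo it on abort; commit is a stub."""

    def __init__(self, do, undo):
        self.do = do
        self.undo = undo

    def attempt(self):
        try:
            self.do()
            return True
        except Exception:
            return False

    def commit(self):
        pass  # stub: the work was already done during attempt

    def abort(self):
        self.undo()


# A dictionary standing in for an LDAP directory entry.
directory = {"uid=jdoe": "old@example.com"}

update = AttemptAndAbortStep(
    do=lambda: directory.update({"uid=jdoe": "new@example.com"}),
    undo=lambda: directory.update({"uid=jdoe": "old@example.com"}),
)
update.attempt()
seen_during = directory["uid=jdoe"]  # other readers already see the new value
update.abort()                       # the parent transaction failed
seen_after = directory["uid=jdoe"]

calls = []
notify = CommitOnlyStep(lambda: calls.append("service invoked"))
notify.attempt()
calls_after_attempt = list(calls)    # nothing has happened yet
notify.commit()
```

The value of `seen_during` shows why the third strategy is only an "almost transaction": the change is visible to everyone before the parent transaction has decided whether to commit.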
Which approach should you take? I prefer strategies #1 and #4: when it comes to transactions I want to do it right or not do it at all. The problem with implementing full transactional logic is that it can be a lot of work. I’ll consider the attempt and abort strategy when it is possible to live with the results of a collision, and strategy #2 as a last resort. A major issue is that strategy #4 is the only one to pass the ACID test.