Introduction to Transaction Control

Transactions are collections of actions that potentially modify two or more entities. An example of a transaction is a transfer of funds between two bank accounts. The transaction consists of debiting the source account, crediting the target account, and recording the fact that this occurred. Transactions are not simply the domain of databases; instead, they are potentially pertinent to all of your architectural tiers. Therefore, it behooves all developers to understand the fundamentals of transaction control.

1. Introduction to Transaction Control

Let’s start with a few definitions. Bernstein and Newcomer (1997) distinguish between:

Business transaction. A business transaction is an interaction in the real world where something is exchanged.
Online transaction. An online transaction is the execution of a program that performs an administrative or real-time function, often by accessing shared data sources, usually on behalf of an online user. This transaction program contains the steps involved in the business transaction. This definition of an online transaction makes it clear that there is far more to this topic than database transactions.

A transaction-processing (TP) system is the hardware and software that implements the transaction programs. A TP monitor is a portion of a TP system that acts as a kind of funnel or concentrator for transaction programs, connecting multiple clients to multiple server programs. In a distributed system, a TP monitor will also optimize the use of the network and hardware resources. Examples of TP monitors include IBM’s Customer Information Control System (CICS) and IBM’s Information Management System (IMS).

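The focus of this article is on the fundamentals of online transactions (i.e., the technical side of things). The critical concepts are the ACID properties, two-phase commits, nested transactions, and the implementation of transactions.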

2. Transaction Control: The ACID Properties

An important fundamental of transactions is the set of four properties that they must exhibit:

Atomicity. The whole transaction occurs or nothing in the transaction occurs; there is no in between. In SQL, the changes become permanent when a COMMIT statement is issued, and are aborted on a ROLLBACK statement. For example, the transfer of funds between two accounts is a transaction. If we transfer $20 from account A to account B, then at the end of the transaction A’s balance will be $20 lower and B’s balance will be $20 higher (if the transaction is completed) or neither balance will have changed (if the transaction is aborted).
Consistency. When the transaction starts the entities are in a consistent state, and when the transaction ends the entities are once again in a consistent, albeit different, state. The implication is that the referential integrity rules and applicable business rules still apply after the transaction is completed.
Isolation. All transactions work as if they alone were operating on the entities. For example, assume that a bank account contains $200 and each of us is trying to withdraw $50. Regardless of the order of the two transactions, at the end of them the account balance will be $100, assuming that both transactions work. This is true even if both transactions occur simultaneously. Without the isolation property two simultaneous withdrawals of $50 could result in a balance of $150 (both transactions saw a balance of $200 at the same time, so both wrote a new balance of $150). Isolation is often referred to as serializability.
Durability. The entities are stored on a persistent medium, such as a relational database or file, so that if the system crashes the effects of committed transactions are not lost.
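
The arithmetic behind the isolation example can be sketched in a few lines of Python. This is only an illustration: the Account class and the two functions are hypothetical names invented here, and the interleaving is simulated deterministically rather than with real concurrent transactions.

```python
class Account:
    """Hypothetical in-memory account, used only for illustration."""
    def __init__(self, balance):
        self.balance = balance


def interleaved_withdrawals(account, amount):
    """No isolation: both transactions read before either one writes."""
    seen_by_t1 = account.balance            # T1 reads 200
    seen_by_t2 = account.balance            # T2 reads 200
    account.balance = seen_by_t1 - amount   # T1 writes 150
    account.balance = seen_by_t2 - amount   # T2 overwrites with 150
    return account.balance


def serialized_withdrawals(account, amount):
    """Isolation: each transaction reads and writes before the next starts."""
    for _ in range(2):
        seen = account.balance
        account.balance = seen - amount
    return account.balance


lost_update = interleaved_withdrawals(Account(200), 50)
isolated = serialized_withdrawals(Account(200), 50)
print(lost_update, isolated)  # 150 100
```

The first result is the lost update described above; the second is what any serial ordering of the two withdrawals produces.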

3. Transaction Control: Two-Phase Commits (2PC)

As the name suggests, there are two phases to the 2PC protocol: the attempt phase, where each system tries its part of the transaction, and the commit phase, where the systems are told to persist the transaction. The 2PC protocol requires the existence of a transaction manager to coordinate the transaction. The transaction manager assigns a unique transaction ID to the transaction to identify it. The transaction manager then sends the various transaction steps to each system of record so they may attempt them, each system responding back to the transaction manager with the result of the attempt. If an attempted step succeeds then at this point the system of record must lock the appropriate entities and persist the potential changes in some manner (to ensure durability) until the commit phase. Once the transaction manager hears back from all systems of record that the steps succeeded, or once it hears back that a step failed, it sends either a commit or an abort request, respectively, to every system involved.
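
The protocol described above can be sketched in memory as follows. This is a minimal sketch under stated assumptions: SystemOfRecord, TransactionManager, and the fail_on_attempt flag are hypothetical names invented for illustration, staged changes live in a dictionary rather than in durable storage, and there is no handling of timeouts or of a transaction manager crash, all of which a real 2PC implementation must address.

```python
import uuid


class SystemOfRecord:
    """Hypothetical participant that supports attempt, commit, and abort."""
    def __init__(self, name, data, fail_on_attempt=False):
        self.name = name
        self.data = dict(data)
        self.fail_on_attempt = fail_on_attempt  # simulates a failing step
        self.locked = set()                     # keys locked by pending work
        self.pending = {}                       # transaction ID -> staged changes

    def attempt(self, tx_id, changes):
        # Refuse if the step fails or another transaction holds a lock.
        if self.fail_on_attempt or self.locked.intersection(changes):
            return False
        self.locked.update(changes)             # lock the affected entities
        self.pending[tx_id] = dict(changes)     # stage the potential changes
        return True

    def commit(self, tx_id):
        changes = self.pending.pop(tx_id, {})
        self.data.update(changes)
        self.locked.difference_update(changes)

    def abort(self, tx_id):
        changes = self.pending.pop(tx_id, {})
        self.locked.difference_update(changes)


class TransactionManager:
    """Hypothetical coordinator for the two phases."""
    def run(self, steps):
        tx_id = uuid.uuid4().hex                # unique transaction ID
        all_succeeded = True
        for system, changes in steps:           # phase 1: attempt
            if not system.attempt(tx_id, changes):
                all_succeeded = False
                break
        for system, _ in steps:                 # phase 2: commit or abort
            if all_succeeded:
                system.commit(tx_id)
            else:
                system.abort(tx_id)
        return all_succeeded


manager = TransactionManager()
bank_a = SystemOfRecord("bank A", {"A": 100})
bank_b = SystemOfRecord("bank B", {"B": 100})
committed = manager.run([(bank_a, {"A": 80}), (bank_b, {"B": 120})])

bank_c = SystemOfRecord("bank C", {"C": 100}, fail_on_attempt=True)
aborted = manager.run([(bank_a, {"A": 60}), (bank_c, {"C": 120})])
```

In the second run bank C fails its attempt, so the manager sends an abort to every system involved and bank A keeps the balance from the first, committed transfer.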

4. Transaction Control: Nested Transactions

So far I have discussed flat transactions, transactions whose steps are individual activities. A nested transaction is a transaction where some of its steps are other transactions, referred to as subtransactions. Nested transactions have several important features:

When a program starts a new transaction, if it is already inside of an existing transaction then a subtransaction is started; otherwise a new top-level transaction is started.
There does not need to be a limit on the depth of transaction nesting.
When a subtransaction aborts, all of its steps are undone, including any of its subtransactions. However, this does not cause the abort of the parent transaction; instead, the parent transaction is simply notified of the abort.
When a subtransaction is executing the entities that it is updating are not visible to other transactions or subtransactions (as per the isolation property).
When a subtransaction commits then the updated entities are made visible to other transactions and subtransactions.
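
Many relational databases approximate subtransactions with savepoints, and the behavior when a subtransaction aborts can be sketched with SQLite through Python's sqlite3 module. Treat this as an approximation rather than a full nested transaction model: the audit table and the savepoint names are invented for illustration, and releasing a savepoint only folds its work into the parent, so other connections still see nothing until the top-level transaction commits.

```python
import sqlite3

# isolation_level=None turns off the module's implicit BEGIN handling,
# so every transaction boundary below is issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE audit (note TEXT)")

conn.execute("BEGIN")                                   # top-level transaction
conn.execute("INSERT INTO audit VALUES ('parent step')")

conn.execute("SAVEPOINT sub1")                          # subtransaction that aborts
conn.execute("INSERT INTO audit VALUES ('sub1 step')")
conn.execute("ROLLBACK TO sub1")                        # undo only sub1's work
conn.execute("RELEASE sub1")                            # parent carries on

conn.execute("SAVEPOINT sub2")                          # subtransaction that commits
conn.execute("INSERT INTO audit VALUES ('sub2 step')")
conn.execute("RELEASE sub2")

conn.execute("COMMIT")

notes = [row[0] for row in conn.execute("SELECT note FROM audit ORDER BY rowid")]
print(notes)  # ['parent step', 'sub2 step']
```

The aborted subtransaction leaves no trace, yet the parent transaction and the second subtransaction still commit.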

5. Transaction Control: Implementing Transactions

Although transactions are often thought of as a database issue, the reality could not be further from the truth. From the introduction of TP monitors such as CICS and Tuxedo in the 1970s and 1980s, to the CORBA-based object request brokers (ORBs) of the early 1990s, to the EJB application servers of the early 2000s, transactions have clearly been far more than a database issue. This section explores three approaches to implementing transactions that involve both object and relational technology. This material is aimed at application developers as well as Agile data engineers who need to explore strategies that they may not have run across in traditional data-oriented literature. These implementation options are database transactions, object transactions, and distributed object transactions; the section closes with strategies for including non-transactional steps.

5.1 Database Transactions

The simplest way for an application to implement transactions is to use the features supplied by the database. Transactions can be started, attempted, then committed or aborted via SQL code. Better yet, database APIs such as Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) provide classes that support basic transactional functionality.
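
As a rough illustration of the database API approach, here is the funds transfer from the introduction written against Python's built-in sqlite3 module rather than JDBC or ODBC. The table names, the transfer function, and the insufficient-funds rule are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE transfer_log (source TEXT, target TEXT, amount INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 100)])
conn.commit()


def transfer(conn, source, target, amount):
    """Debit, credit, and record the transfer as one unit of work."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                     (amount, source))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (amount, target))
        conn.execute("INSERT INTO transfer_log VALUES (?, ?, ?)",
                     (source, target, amount))
        (remaining,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (source,)).fetchone()
        if remaining < 0:
            raise ValueError("insufficient funds")  # assumed business rule
        conn.commit()                               # make all three steps permanent
        return True
    except (sqlite3.Error, ValueError):
        conn.rollback()                             # undo every step attempted
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


first = transfer(conn, "A", "B", 20)      # succeeds
after_first = balances(conn)
second = transfer(conn, "A", "B", 500)    # violates the rule, rolled back
after_second = balances(conn)
```

The second call leaves both balances and the log untouched, which is the all-or-nothing behavior described under atomicity.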

5.2 Object Transactions

At the time of this writing support for transaction control is one of the most pressing issues in the web services community and full support for nested transactions is underway within the EJB community. As you see in Figure 1, databases aren’t the only things that can be involved in transactions. The fact is that objects, services, components, legacy applications, and non-relational data sources can all be included in transactions.

Figure 1. Transactions can involve more than just databases.

The advantage of adding behaviors implemented by objects (and similarly services, components, and so on) to transactions is that your transactions become far more robust. Can you imagine using a code editor, word processor, or drawing program without an undo function? If not, then I believe it becomes reasonable to expect both behavior invocations and data transformations as steps of a transaction. Unfortunately this strategy comes with a significant disadvantage: increased complexity. For this to work your business objects need to be transactionally aware. Any behavior that can be invoked as a step in a transaction must support attempt, commit, and abort/rollback operations. Adding support for object-based transactions is a non-trivial endeavor.
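
To give a feel for what "transactionally aware" means, here is a minimal sketch of a business object whose withdraw behavior supports attempt, commit, and abort. The class and method names are hypothetical, and the sketch handles only one pending change at a time; a production version would also need locking and recovery logic.

```python
class Account:
    """Hypothetical business object with a transactionally aware withdrawal."""
    def __init__(self, balance):
        self.balance = balance
        self._pending = None          # balance staged by an attempt, if any

    def attempt_withdraw(self, amount):
        # Refuse if another attempt is outstanding or funds are insufficient.
        if self._pending is not None or amount > self.balance:
            return False
        self._pending = self.balance - amount
        return True

    def commit(self):
        if self._pending is not None:
            self.balance = self._pending
            self._pending = None

    def abort(self):
        self._pending = None          # discard the staged change


account = Account(200)

ok = account.attempt_withdraw(50)
account.abort()
balance_after_abort = account.balance    # unchanged

ok_again = account.attempt_withdraw(50)
account.commit()
balance_after_commit = account.balance   # reduced by 50

too_much = account.attempt_withdraw(1000)
```

Even this toy version roughly triples the code needed for a plain withdraw method, which hints at the complexity cost mentioned above.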

5.3 Distributed Object Transactions

Just like it is possible to have distributed data transactions it is possible to have distributed object transactions as well. To be more accurate, as you see in Figure 1 it’s just distributed transactions period – it’s not just about databases any more, but it’s databases plus objects plus services plus components plus… and so on.

5.4 Including Non-Transactional Steps

Sometimes you find that you need to include a non-transactional source within a transaction. A perfect example is an update to information contained in an LDAP directory or the invocation of a web service, neither of which at the time of this writing supports transactions. The problem is that as soon as a step within a transaction is non-transactional, the transaction really isn’t a transaction any more. You have four basic strategies available to you for dealing with this situation:

Remove the non-transactional step from your transaction. In practice this is rarely an option, but if it’s a viable strategy then consider doing so.
Implement commit. This strategy, which could be thought of as the “hope the parent transaction doesn’t abort” strategy, enables you to include a non-transactional step within your transaction. You will need to simulate the attempt, commit, and abort protocol used by the transaction manager. The attempt and abort behaviors are simply stubs that do nothing other than implement the requisite protocol logic. The one behavior that you do implement, the commit, will invoke the non-transactional functionality that you want. A different flavor of this approach, which I’ve never seen used in practice, would put the logic in the attempt phase instead of the commit phase.
Implement attempt and abort. This is an extension to the previous technique whereby you basically implement the “do” and “undo” logic but not the commit. In this case, the work is done in the attempt phase; the assumption is that the rest of the transaction will work, but if it doesn’t, you still support the ability to roll back the work. This is an “almost transaction” because it doesn’t avoid the problems with collisions described earlier.
Make it transactional. With this approach, you fully implement the requisite attempt, commit, and abort behaviors. The implication is that you will need to implement all the logic to lock the affected resources and to recover from any collisions. An example of this approach is supported by the J2EE Connector Architecture (JCA), in particular by the LocalTransaction interface.
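
The "implement commit" and "implement attempt and abort" strategies can be sketched as thin wrappers around the non-transactional functionality. The class names, the in-memory directory dictionary standing in for an LDAP directory, and the list standing in for a web service call are all assumptions made for illustration.

```python
class CommitOnlyStep:
    """Implement commit: attempt and abort are stubs."""
    def __init__(self, action):
        self.action = action

    def attempt(self):
        return True                   # stub: always reports success

    def commit(self):
        self.action()                 # the only place real work happens

    def abort(self):
        pass                          # stub: nothing to undo before commit


class AttemptAndAbortStep:
    """Implement attempt and abort: do the work early, undo it on abort."""
    def __init__(self, do, undo):
        self.do = do
        self.undo = undo
        self.done = False

    def attempt(self):
        self.do()                     # work is visible to others immediately
        self.done = True
        return True

    def commit(self):
        pass                          # nothing left to do

    def abort(self):
        if self.done:
            self.undo()
            self.done = False


directory = {"alice": "old-phone"}    # stand-in for an LDAP entry
update = AttemptAndAbortStep(
    do=lambda: directory.update(alice="new-phone"),
    undo=lambda: directory.update(alice="old-phone"),
)
update.attempt()
during = directory["alice"]           # already changed before any commit
update.abort()
after_abort = directory["alice"]      # restored by the undo logic

sent = []                             # stand-in for a web service call
notify = CommitOnlyStep(lambda: sent.append("welcome message"))
notify.attempt()
before_commit = list(sent)            # nothing has happened yet
notify.commit()
```

The value of during shows why the second wrapper is only an "almost transaction": the change is exposed before the rest of the transaction has committed.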

Which approach should you take? I prefer strategies #1 and #4 – when it comes to transactions, I want to do it right or not do it at all. The problem with implementing full transactional logic is that it can be a lot of work. I’ll consider the attempt-and-abort strategy when it is possible to live with the results of a collision, and strategy #2 as a last resort. A major issue is that strategy #4 is the only one to pass the ACID test.