Changes to the Cursor Stability Isolation Level: Part 2
Understanding Currently Committed behavior
In my last column, I explained how isolation levels play a key role in preventing databases from becoming inconsistent in multi-user environments. I also showed how the Cursor Stability (CS) isolation level worked both before and after lock avoidance techniques were implemented in IBM DB2 9.5. In this column, I’ll introduce you to the Currently Committed (CC) semantics that were included in DB2 9.7, and I’ll show you how these semantics provide faster data access and increased data concurrency for SQL statements running under the CS isolation level.
Currently committed semantics
DB2 9.7 introduces a new implementation of the CS isolation level that incorporates CC semantics to further prevent writers from blocking readers when this isolation level is used. The intent is to provide a CS isolation level that avoids lock waits without violating CS semantics (like the use of the DB2_SKIPDELETED and DB2_EVALUNCOMMITTED registry variables does—see sidebar, “Registry variables used to delay or avoid acquiring locks in some circumstances”).
Using full lock avoidance techniques (as described in my last column), a read-only SQL statement operating under CC semantics will not acquire a lock as long as DB2 can determine that the data needed has been committed. However, if DB2 is unable to tell whether the data needed has been committed, the transaction executing the statement will try to acquire a lock on the row in question. If a lock can be acquired, processing will continue using traditional CS isolation level behavior. If a lock cannot be acquired (because another transaction holds an Exclusive lock on the row), DB2 will examine the lock that is held by the other transaction to obtain information about the row. The lock will contain one (and only one) of the following:
- No information: Indicates that the row is locked but nothing has been done to it (that is, no uncommitted changes are in flight)
- An Uncommitted Insert identifier: Indicates that the row is a newly inserted row that has not yet been committed
- Log information: Indicates that the row contains uncommitted data; in this case, the log information identifies the log record that corresponds to the first time the row was modified by the transaction that currently holds the lock on the row
If the lock contains no information, the row is treated as if the desired lock was acquired. If the lock contains an Uncommitted Insert identifier, the row is skipped since this identifier represents a row that has not yet been committed. If the lock contains log information, this information will be used to return the CC version of the row (that is, the row as it existed before changes were initiated) from a log record stored in either the log buffer or from a transaction log file. DB2 uses the Log Sequence Number (LSN) to directly access the appropriate log record (see sidebar,“How DB2 can tell if data is committed”).
Figure 1: In this example, Transaction 1 executed two DML statements, which caused log information and an Uncommitted Insert identifier to be written to the lock list for the SALES_REP table. When Transaction 2 queried the SALES_REP table, CC semantics allowed data for locked rows to be read from log records for previously committed transactions; the record for the Uncommitted Insert was not returned.
Figure 1 illustrates how a SELECT statement using the CS isolation level with CC semantics will retrieve records. It is important to note that CC semantics can apply to SQL statements that are executed under both Read Stability (RS) and CS isolation levels. However, under the RS isolation level, CC provides only DB2_SKIPINSERTED behavior, which is the capability to no longer incur lock waits for uncommitted inserted rows.
Registry variables used to delay or avoid acquiring locks in some circumstances
When transactions are run concurrently, the following phenomena can occur:
- DB2_SKIPINSERTED: Allows CS/RS scans to skip uncommitted inserted row
- DB2_SKIPDELETED: Allows CS/RS scans to skip uncommitted deleted rows and index keys
- DB2_EVALUNCOMMITTED: Allows CS/RS scans to apply and perform query predicate evaluation on uncommitted data; also allows the scans to skip uncommitted deleted rows. In effect, it treats the scan as an Uncommitted Read until it finds a qualifying row, at which time it may need to lock to ensure only committed data is processed/returned.
Enabling Currently Committed semantics behavior
By default, CC semantics are turned on for new databases created with DB2 9.7. Existing databases that are upgraded to DB2 9.7 can take advantage of CC semantics by assigning either the value ON or the value AVAILABLE to the cur_commit database configuration parameter of the database that has been converted. If the cur_commit database configuration parameter is set to ON, CC semantics are applied database-wide for both the CS and RS isolation levels. If the cur_commit database configuration parameter is set to AVAILABLE, DB2 will store the appropriate information in locks and perform the extra logging overhead needed (to ensure that the logged data contains the full uncommitted version of the row being changed) to support CC semantics, but CC semantics behavior will have to be enabled on an application-by-application basis. This is done by either binding an embedded SQL application to the database using the CONCURRENTACCESSRESOLUTION USE CURRENTLY COMMITTED option or by specifying the SQL_ATTR_CONCURRENT_ACCESS_RESOLUTION connection attribute with CLI/ODBC and Java applications.
Note that the use of CC semantics requires an increase in log space for update operations to tables that are defined as DATA CAPTURE NONE. This additional space is used to log the first update of a data row by a current transaction; this data is used to retrieve the currently committed image of the row.
How DB2 can tell if data is committed
All data row and index entries have a “flags” byte that contains a “Possibly UNCommitted” (PUNC) bit. If the PUNC bit is not set, the data row/index entry is guaranteed to be committed; otherwise the commit status is unknown.
Pages contain a “pageLSN” that identifies the LSN of the log record that corresponds to the last modification made to the page. If the pageLSN is older than the database’s commitLSN or a table’s readLSN, then the row/key is guaranteed to be committed; otherwise the commit status is unknown.
Maintaining data consistency and integrity, while allowing multiple transactions to access the same data at the same time, can be challenging, particularly in multi-user environments. Transactions, isolation levels, and locks are mechanisms to help maintain data consistency, and the CS isolation level tends to be used more often because it provides the greatest amount of concurrency while preventing dirty reads (which occur when a transaction reads data that hasn’t yet been committed). Prior to DB2 9.7, use of this isolation level could result in delayed response times because of lock waits.
CC semantics provide a whole new implementation of the CS isolation level that does not violate ANSI standards for CS isolation levels. With CC semantics, transactions performing read and write operations avoid lock waits on uncommitted inserts, and transactions performing read-only operations end up trading a lock wait for a log read when they encounter uncommitted updates/deletes from concurrent transactions. This behavior practically guarantees that application response times will no longer be affected by locking.Special thanks to Senior Technical Staff Member–DB2 Kernel Architect Mike Winer for providing information used to develop this article.