
Reducing Conversations with DB2 for z/OS: Part 3

Different ways of thinking about ordinary things

In my two previous columns, I wrote about unnecessary SQL and gave examples of SQL that is executed far too many times and SQL that should be replaced with newer, better-performing SQL. In this column, I will tell you about more situations where SQL can be deferred, combined, or avoided completely, all with the goal of reducing conversations with DB2.

Three steps to remove unnecessary re-reads

Screen-display applications are notorious for unnecessary re-reads. We read to get an image to display; the user changes values on the screen and hits a function key to update; the program again takes control and re-reads the row (usually this time with a CURSOR using FOR UPDATE OF). After the FETCH, the program does “before and after” checking to see if anything on the row was changed by a different user while our user was staring at the screen display.

If the row is the same as the originally displayed row, our user’s UPDATE is done. If any column on the row has changed, the appropriate information/error is provided to the user on the returned display.

The solution for this unnecessary SQL problem is to use an “optimistic locking” technique.

We are confident that most, if not all, of the time, our row will not change during the screen display. Our UPDATE glass is half full, not half empty. Therefore, we will do our UPDATE without re-reading the row. But we are neither foolish nor stupid; we will use a tried-and-true technique to ensure that our UPDATE is done to an unchanged row: we include in our WHERE clause one or more predicates that test for change. We have many ways to do that test, some allowing more throughput than others. For example, we can:

  1. Design our table to include a DB2-maintained ROW CHANGE TIMESTAMP column (new in DB2 9), and include a predicate in our WHERE clause that checks whether that column still holds the value we originally read (a sketch of this option follows the list).
  2. Compare a user-maintained last-update timestamp to its original value (and reset that timestamp in our SET clause) to make sure that nothing on the row has changed since we received our image of the row.
  3. Include WHERE clause predicates for every column on the row to see if each column value is the same as its original value.
  4. Include WHERE clause predicates for only the columns that are important and relevant to us. We definitely want to make sure that the row still qualifies, and we may want to verify that columns we selected have not changed.
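
Here is a minimal sketch of option 1. The table name (CUSTOMER), its columns, and the host variables are invented for illustration; the row change timestamp support is the DB2 9 feature mentioned above:

Alter table customer
Add column row_changed timestamp not null
Generated always for each row on update as row change timestamp
(table, column, and host-variable names are illustrative)

Update customer
Set cust_name = :hv-new-name
Where custno = :hv-custno
And row_changed = :hv-rct-at-display

If another user changed the row while ours sat on the screen, the timestamp no longer matches, no row qualifies, DB2 returns a +100 SQLCODE, and the program can tell the user to refresh. One UPDATE replaces the re-read and the column-by-column "before and after" comparison.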

Major changes to improve performance

Say you have an index on TRANDATE, PROCESS-DATE, and CUSTNO. You want to FETCH, write to a report, and then DELETE every row that is older than three months, is inactive, and has a CUSTNO in a list found in your program’s working storage. You can code:

Declare csr-delete-old-rows Cursor for
Select ponbr, custno, trandate, process-date
From bigtable
Where trandate < current date - 3 months
And status = :hv-inactive
Open csr-delete-old-rows
Fetch csr-delete-old-rows into
:hvponbr, :hvcustno, :hvtrandate, :hvprocess-date
Write to a report
Delete from bigtable
Where current of csr-delete-old-rows

If there are 3,000 index rows pointing to table rows that have a TRANDATE older than three months, but only 500 of those 3,000 table rows have a STATUS that is “inactive,” then the worst-case scenario is that you will:

  1. CONNECT to DB2 once for the OPEN to do CURSOR setup
  2. CONNECT to DB2 for each FETCH: 500 FETCHes that return a row, plus one more that returns the end-of-data +100 SQLCODE. Then, for each row-returning FETCH, DB2 will:
  3. Do multiple GET PAGE requests for the pages in the multi-level index tree
  4. Read an average of six index entries (3,000 qualifying index entries for the 500 rows that eventually pass the STATUS predicate)
  5. For each of the six index RIDs, do a GET PAGE request to the table, followed by a potential synchronous wait on I/O
  6. After the GET PAGE requests (and read I/Os) to the table, apply the STATUS predicate to the six table rows, reject five and accept one
  7. Return a row to the program
  8. Of the 500 qualified rows, only 25 of the CUSTNOs are in the working storage list (I call conditions that reject rows in your program after the return from DB2 “Stage 3 predicates”—and Stage 3 predicates are far worse than Stage 2 predicates)
  9. CONNECT to DB2 again for each fully qualified row, this time to do the DELETE (25 connects)
  10. Add one last CONNECT for the CLOSE, for a grand total of 528 CONNECTs (1 OPEN + 501 FETCHes + 25 DELETEs + 1 CLOSE) and way too many GET PAGEs to both the index and the table

Now what can we do to reduce the number of conversations in this situation? I suggest four improvements (you will remember the advice in step 2 from the last column).

  1. If feasible, add STATUS to the index to make the qualification index-only
  2. Change the CURSOR to use ROWSET POSITIONING to read more than one row at a time
  3. Use a SELECT from the DELETE statement itself (SELECT ... FROM OLD TABLE (DELETE ...)) to see the deleted rows
  4. Get rid of the STAGE 3 disqualification (the program check of CUSTNO) by INSERTing the numbers into a CREATED GLOBAL TEMPORARY TABLE (e.g., CTT_CUST) and by using that temporary table in your SQL, as sketched below
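
Here is a minimal sketch of improvement 4. The CUSTNO data type and the single-row INSERT are assumptions for illustration; the CREATE is a one-time action, and each thread that references CTT_CUST gets its own empty instance of the table:

Create global temporary table ctt_cust
(custno char(10) not null)
(one-time DDL; the data type is illustrative)

Insert into ctt_cust (custno)
Values (:hv-custno)
(repeat, or use a multi-row INSERT, for each number in the working-storage list)

With CTT_CUST populated before the OPEN, the rewritten cursor combines all four improvements: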

Declare csr-select-and-delete-old-rows Cursor
With rowset positioning For
Select ponbr, custno, trandate, process-date
From old table
(Delete from bigtable
Where trandate < current date - 3 months
And status = :hv-inactive
And custno in (select custno from ctt_cust))

Open csr-select-and-delete-old-rows
(One CONNECT)

Fetch next rowset from csr-select-and-delete-old-rows
For 100 rows into
:hvponbr-array, :hvcustno-array, :hvtrandate-array, :hvprocess-date-array
(One CONNECT with a +100 SQLCODE returned for row 26)

Close csr-select-and-delete-old-rows
(One CONNECT)

The new best-case scenario is that you will:

  1. CONNECT to DB2 once for the OPEN to do CURSOR setup
  2. CONNECT to DB2 for the one rowset FETCH, during which DB2 will:
  3. Do GET PAGE requests for the pages in the index tree
  4. Read 25 index entries, fully qualifying the rows before reading the table
  5. For each of the 25 RIDs, do a GET PAGE request to the table (if we are lucky and some of our rows share the same pages, we will do fewer GET PAGEs)
  6. For each GET PAGE we may have to do a READ I/O
  7. As each row is found, DELETE it
  8. Place the SELECTed columns of each deleted row into the OLD TABLE result
  9. Return 25 rows to the program
  10. Write the 25 rows to the report
  11. CONNECT to DB2 one last time for the CLOSE

Leveraging the preceding four techniques reduced the conversations from 528 to 3 while lowering the number of table GET PAGE requests from 3,000 to at most 25.

Eliminating Stage 3

I’d like to elaborate on the fourth suggested improvement (Get rid of the STAGE 3 disqualification). One of the most common coding missteps I see as a consultant is the practice of validating and rejecting rows in the program instead of in DB2.

These STAGE 3 predicates, especially those that reject a majority of the rows, are usually the result of old VSAM logic, which had no WHERE clause to lean on. Back then, we used keyed reads and then applied program-coded IF/THEN/ELSE logic to the records. That IF/THEN/ELSE logic is fertile ground for finding conditions that can be moved into WHERE clause predicates, so rows are rejected inside DB2’s address space instead of in the program.
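
Here is a minimal before-and-after sketch of the idea; the REGION column, the cursor names, and the host variables are invented for illustration. A program that fetches every row and rejects the unwanted ones itself:

Fetch csr-all-rows into :hv-ponbr, :hv-custno, :hv-region
If hv-region not = ws-wanted-region
skip the record and FETCH the next row
(Stage 3: the row crossed the API just to be thrown away)

can move that test into the cursor so DB2 rejects those rows before they are ever returned:

Declare csr-wanted-rows Cursor for
Select ponbr, custno, region
From bigtable
Where region = :ws-wanted-region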

This is your conscience speaking

I wrote this three-part series to emphasize the importance of totally eliminating conversations with DB2, or deferring them until the very last moment. In the future, as you code each SQL statement, I hope a little Bonnie will be sitting on your shoulder, shaking her finger and asking, “Is this necessary? Can you do this read/maintenance to more than one row at a time? Can you defer this SQL to a later time, until you are absolutely sure that it needs to be done?”
