Blogs

New ORDER BY Information: Part 1

Avoiding sorts with data-partitioned secondary indexes

In the latest releases of IBM® DB2® for IBM z/OS® data management, our body of knowledge about ORDER BY has to be revamped. Not only are there many new things to learn, but there are also many changes to old things that we once knew about ORDER BY. I’m going to devote the next few columns to exploring five of these thought-provoking, even worrisome, issues:

  • The impact of a data-partitioned secondary index (DPSI) structure on ORDER BY sort avoidance
  • The impact of using the “random” index option on ORDER BY sort avoidance
  • The issue of having only a GROUP BY COL1, COL2 versus having both a GROUP BY COL1, COL2 and a followup (unnecessary?) ORDER BY COL1, COL2 in your SQL statement
  • The potential of using DYNAMIC SCROLL cursors as a new way of forcing DB2 to avoid a sort for an ORDER BY (or any other sort syntax where an index can be used to avoid the sort)
  • The new (limited) ability to use the UPDATE WHERE CURRENT OF cursor even though there is an ORDER BY in the CURSOR declaration—that is, the cursor is an “unambiguously read-only cursor”

Some background on sort avoidance

Long ago, in one of my Programmers Only columns (“Looking for a little ORDER?”), I explained that DB2 does not automatically build a result set of all qualified rows at OPEN CURSOR. If you have a long-running OPEN CURSOR, it is not because DB2 is building a result set. It is most likely because DB2 is doing a data sort at OPEN (usually to satisfy a common ORDER BY). I do not mind when folks say that DB2, when sorting, builds a SORTOUT file at OPEN CURSOR. Why? Because it’s clear that the work is the result of having to sort. I do take issue with folks saying that DB2 builds a result set at OPEN because calling it a “result set” implies that it is always built for every OPEN and does not explain that the result set is actually a SORTOUT file whose creation and necessity are driven by both syntax and access path. It is not a given, nor is it a requirement.

Remember, sort syntax such as ORDER BY does not necessarily have to cause a data sort. With the appropriate access path (an index is available and by using it the right way, DB2 can return qualified rows back to your program in the desired order), the ORDER requirement can be met without sorting. In fact, the use of an index for sort avoidance is a key way to improve the performance of SQL that qualifies far more rows than will actually ever be FETCHed. If DB2 does not need to sort, then all of the work for finding a qualified row (or rows with multirow FETCH) will be done at each FETCH. You stop FETCHing, and DB2 will stop looking for qualified rows. You will only pay the price of finding the row(s) you actually need.

Now, what do I mean by using the appropriate index the right way? An appropriate index is one that will allow DB2 to retrieve the rows that you want in the order in which you want them without sorting. For example, if you have a four-column index on LASTNAME, FIRSTNAME, MIDINIT, EMPID (all ascending), and you code the following SQL statement: SELECT … FROM BIGTABLE WHERE LASTNAME = :HVLASTNAME followed by any of the following ORDER BY clauses:

ORDER BY FIRSTNAME, MIDINIT, EMPID
ORDER BY FIRSTNAME, MIDINIT
ORDER BY FIRSTNAME
ORDER BY FIRSTNAME DESC, MIDINIT DESC,
EMPID DESC
ORDER BY FIRSTNAME DESC, MIDINIT DESC
ORDER BY FIRSTNAME DESC

then DB2 could use the four-column index the “traditional” way (index, table, index, table, etc.) to return the rows in the desired order (the same order as or the reverse order of the index). If DB2 were to use the index the nontraditional, LIST PREFETCH way, the rows would not come back in the order of the index, but rather in the order of the table. That is, the rows would come back in table CLUSTER order if the table were freshly REORGed (with no intervening maintenance or “log apply” impact on its perfect order). When DB2 does not have an index in the appropriate order for sort avoidance or when DB2 uses any index the LIST PREFETCH way, then an ORDER BY will most certainly cause a data sort. Now on to the five new issues mentioned earlier.

>Issue #1: The impact of DPSI structure on sorts

By creating a DPSI, you are essentially taking what would have been a single ordered list (once a non-partitioning index [NPI], now renamed non-partitioned secondary index [NPSI]) with row identifier (RID) pointers to all partitions of a table space and turning it into multiple ordered lists, each with pointers to only one partition. Since our subject index is not in the same order as our table (it is not our CLUSTER index), the order of an NPSI would be random to the entire table while the order of each index partition of a DPSI would be random only to its own partition and (we like this) has the probability of being a narrower (fewer leaf pages) and shorter (fewer levels) index. These are especially good features when you are accessing rows from a single partition. We are not so pleased with the DPSI if we must probe multiple index trees to find rows across multiple partitions. If we make this index our CLUSTER index, as an NPSI, the index would still be random to the table (partitioned on LASTNAME) as a whole, but as a DPSI, each index partition would be aligned (not random) to its table partition. Again, this is a great feature when you are accessing rows from a single partition. This is, of course, the ideal scenario. The less-than-ideal scenario of using a DPSI to access multiple partitions can be very painful from a SQL performance perspective. More on this later.

Figure 1 shows an example of two indexes: one a DPSI and the other an NPSI on the column ZIPCODE. The table is in the middle of the picture. The column data on the table are EMPID, LASTNAME, HIREDATE, and ZIPCODE. The table is currently CLUSTERed on LASTNAME.

Figure 1: Example of DPSI and NPSI indexes on the column ZIPCODE

Because the NPSI is a single ordered list, it is easy to see how DB2 could just toggle back and forth, index to table, index to table, and return rows in ZIPCODE order without sorting. The reads would be random (ouch), but for some transactions a few random reads is a small price to pay in exchange for eliminating a large data sort. In other words, many rows qualify but the program only FETCHes 10 rows; for example:

SELECT … FROM BIGTABLE (for example, 60 million rows)
WHERE ZIPCODE BETWEEN 10000 AND 70000 ORDER BY ZIPCODE
FETCH FIRST 10 ROWS ONLY

The FETCH FIRST clause discourages DB2 from using our index the LIST PREFETCH way (in which case, because of the intervening RID sort, DB2 would need to sort to satisfy the ORDER BY) and encourages DB2 to use the index the traditional way to avoid a large data sort.

However, look at the alternative DPSI; we have three ordered lists (one per partition). Obviously, we have no problem with sort avoidance if we code:

SELECT … FROM BIGTABLE (for example, 60 million rows)
WHERE ZIPCODE BETWEEN 10000 AND 70000
AND LASTNAME BETWEEN 'A                ' AND 'G99999'
ORDER BY ZIPCODE
FETCH FIRST 10 ROWS ONLY

Even though LASTNAME is not part of our ZIPCODE index, DB2 knows that the rows we want can only be in Partition 1. Therefore, it can use Partition 1 of our index the traditional way to avoid a sort on ZIPCODE. But what if we code the following SQL?

SELECT … FROM BIGTABLE (for example, 60 million rows)
WHERE ZIPCODE BETWEEN 10000 AND 70000
AND LASTNAME BETWEEN 'A     ' AND 'T9999'
ORDER BY ZIPCODE
FETCH FIRST 10 ROWS ONLY

DB2 knows that our desired rows cross multiple partition boundaries, and it knows that it will need to either do a table space scan across multiple partitions or use more than one index partition, each in its own order, to find each row. The FETCH FIRST clause encourages sort avoidance. So what does DB2 do?

DB2’s solution

With smoke and mirrors and a bit of match/merge logic, DB2 can use multiple partitions of a DPSI for sort avoidance. Obviously, this is not as trivial as using a single index’s ordered list, but it can be done. The DB2 Optimizer knows that in our case, three partitions (but in others maybe 64 or even more than 4,000 index partitions) must be read in a match/merge fashion in order to return the first 10 rows in ZIPCODE order, but DB2 can factor that possibility into its cost-based decision (see APAR PM25934 (open right now), which is refining DPSI costing). Without the FETCH FIRST (or similarly, but not exactly, without an OPTIMIZE FOR clause), then DB2 would almost certainly prefer some other access path.

The important point here is that you can choose to define indexes as DPSIs without losing that index’s aptitude for sort avoidance, including the sort avoidance required for DYNAMIC SCROLL cursors. You just need to realize that we have been warned about the negatives of DPSIs in cross-partition processing, including the multiple index probes needed for sort avoidance. So take heed.

Negatives of DPSIs

Just in case you think that I am very pro-DPSI, I feel that it is important to stress some of their negatives:

  • Potentially, far more probes and getpages.
  • The loss of index lookaside. An NPSI would take advantage of lookaside on the index, for example, with a nested loop join in which the outer rows drive sequential access to the inner. For an NPSI, this results in index lookaside and sequential detection. For a DPSI where qualified rows are in more than one partition, every probe skips between partitions. The partition skipping breaks index lookaside and can also disrupt sequential detection.

Wrap-up/h2>
Knowing that we do not give up sort avoidance for ORDER BYs with DPSIs is good news. The next few columns will continue with the other four issues regarding ORDER BY (or the lack thereof) on our SQL.