Taking a Database’s Temperature
Apply Index Read Efficiency to help effectively boost database performance
One powerful tool in a database administrator (DBA)’s toolbox is a specific form of analysis that was developed and evangelized by IBM Gold Consultant Scott Hayes during a recent International DB2 Users Group (IDUG) technical conference. The Index Read Efficiency (IREF) tool offers an excellent way to see the overall health of a database.
Time-tested methods such as bufferpool hit ratios, package cache hit ratios, and sort overflows are still important, but the IREF analysis can deepen and strengthen database performance. IREF is a lesson in efficiency rather than a metric to signal pain. Where a low bufferpool hit ratio is a red flag that the database is hurting, IREF is much more subtle. It shows where SQL is doing unnecessary work that can lead to elevated resource utilization.
IREF is an implied metric. It is not an explicit metric collected by the IBM® DB2® database in system tables. However, the math is simple; it is nothing more than looking at the ratio between rows read and rows returned, based on the formula IREF = rows read/rows returned.
IREF shows effort. How many rows were scanned to return each individual row the DBA was seeking? Are indexes dead on, in which one row is scanned for one row returned? Is there an opportunity for index improvements in which 100 rows are scanned for the one row the DBA is looking for? Could missing or poor predicates in queries be causing scanning even though good indexes exist? Generally, good indexes that guide the query efficiently to the desired rows will yield a reduced IREF value.
There are multiple ways to gather data to calculate IREF. The data is scattered through various system tables. Be creative with the SQL or script developed to mine this data. A place to start is to check out the following tables:
MON_GET_PKG_CACHE_STMT (ROWS_READ, ROWS_RETURNED)
MON_CURRENT_UOW (ROWS_READ, ROWS_RETURNED)
SYSIBMADM.SNAPSTMT (ROWS_READ, FETCH_COUNT)
SYSIBMADM.SNAPDYN_SQL (ROWS _READ, ROWS_WRITTEN)
An example of SQL built from the SYSIBMADM.SNAPSTMT table is shown in the figure.
IREF calculation using the SYSIBMADM.SNAPSTMT table
Notice the polar opposite results. The first query scans 7,264 rows to select each row needed; the second query scans only three rows to arrive at the row it seeks. An IREF result means little without context or a gauge to measure results. In general, the following result ranges are recommended by performance expert Hayes to measure efficiency in a transaction processing environment:
- IREF – Doing well
- IREF >= 10 – Cause for investigation
- IREF > 100 – Likely room for improvement
When looking at a data warehouse, the metrics become more challenging. Because of the amount of data and types of transactions, getting to these recommended numbers may not be possible. Comparing IREFs among similar workloads to determine what could be achieved with enhanced indexing or further tuning is not an unrealistic approach.
Taking analysis one step further, DBAs can start to categorize tables by workload types. For example, assume the IREF metric is applied to SQL for a specific table over a few iterations. Through each iteration, row scanning may be improved as development is engaged to write enhanced application code. The SQL workload is optimally tuned. Proper indexing may help eliminate additional scanning. Yet, a table still having some significant row scanning may not be a bad thing—it just shows a true workload type. The table may just be asynchronous in nature, and additional scanning may be happening because of prefetching or proactive page cleaning.
Applying an elegant concept
This information is good to know when tuning. The DBA can consider further measures such as isolating asynchronous and synchronous tables into similar bufferpools. The IREF concept is elegantly simple but very powerful. It allows a DBA to move past pain points to understand how efficiently things are being processed. Take a look at a long-established, well-tuned database in any environment, and apply this method to some SQL. Doing so can elevate performance to the next level.
Please share any thoughts or questions in the comments.