Right Data. Right Place. Right Time.

Storage tiering for the DB2 database administrator

To date, DBAs have had little say in what type of storage their databases lived on: they simply asked for a given amount of space and it was provided by the storage administrator. Now, however, it’s important for DBAs to think about the types of storage that they use, because storage isn’t “just storage” anymore. There are high speeds, low speeds, and in-between speeds, and each is best suited for different uses.

The challenge lies in deciding where to put data to achieve the best return on investment (ROI). This article will explore how to create a tiered approach to storage with IBM DB2 for z/OS and achieve the ultimate goals of getting the right data in the right place at the right time to increase performance optimization and reduce total cost of ownership (TCO).

Disk storage tiers: Choices, choices

Never before has a DB2 DBA faced so many storage options with so many price and performance variables. At the high end, solid-state drives (SSDs) that can comfortably support random I/O access patterns at a rate of 5,000 I/O operations per second (IOPS) are an attractive choice for demanding, I/O-intensive applications. However, they can be relatively expensive.

On the other end of the scale, Serial ATA (SATA) drives can be as large as 2 TB but typically deliver only around 75 IOPS. They perform reasonably well for sequential workloads but do not do so well with random I/O access patterns. Between these extremes sit the commonly used Fibre Channel (FC) drives that have been the backbone of enterprise storage arrays for many years. FC drives deliver 180 IOPS with acceptable response time (less than 5 ms). FC drive capacities are growing very quickly but appear to have reached a speed/response plateau.

However, individual devices are just part of the equation. Drives of all types are arranged in enterprise storage arrays in ways that deliver different price/performance ratios. For example, RAID-10 (striped mirrors) performs better but is more costly than RAID-5. Similarly, RAID-5 (striped single parity) has some performance advantages over RAID-6 (striped double parity), yet RAID-6 provides higher protection against data loss.

You may have some or all of these types of storage available to you, or you may have other variations. Whatever your environment, start your storage tiering plan by defining your tiers. High-cost, high performance storage becomes your gold tier, followed by less-expensive and less-responsive silver and bronze tiers.

The challenge of choosing tiers

If defining storage tiers is relatively straightforward, placing the right data on them often isn’t. The decision starts with the broad parameters of the project: Is the database critical to the business? What are the performance requirements? What are the availability requirements?

From there, the qualifications become more complicated. Within a single DB2 system, not all data is the same; it might make sense to have data for one database stored, in part, on each tier. Even within production data, some tables are rarely accessed while others are accessed continually. In a final twist, access profiles can vary over time (see Figure 1).

You also need to consider characteristics of the I/O that makes up your storage workload. Is it random or sequential? Read or write? What are the ratios of reads to writes? Are the I/Os small or large? Understanding your workload will allow you to make appropriate storage tiering decisions.

Characterizing the workload

Workloads have different intensities when viewed in time and when viewed from the storage perspective. In addition to the intensity, the type of workload can have a bearing on which storage tier is optimal for deployment. All I/Os are not equal!

Consider a write operation: Most enterprise storage arrays cache an inbound write and return a channel end/device end acknowledgment immediately to the host. This process is usually referred to as a direct access storage device (DASD) fast write. At a later time, the write is “destaged” to physical storage. In this case, does it really matter what the underlying storage is? To further muddy the waters, writes from the DB2 buffer pool are usually written asynchronously with respect to the transactional workload and are therefore a lower priority than, say, synchronous reads.

Sequential read workloads, such as those generated during a tablespace scan, are also interesting. When DB2 processes a plan that requires a tablespace scan, it performs prefetch activities, which are asynchronous readahead I/Os. When the storage array detects this sequential activity, it also performs prefetch by reading the table data into cache on the array. This puts the disk into a kind of streaming mode, notwithstanding interruptions from other I/O requests to the same physical storage media.

In streaming mode, spinning disks are actually as good, if not better than, solid-state disks. The gold tier might not be the best choice for this kind of DB2 workload. In fact, moving this data to the gold tier will not only fail to improve performance, it may take up valuable space that could be used for other I/O activity.

While selecting the correct tier for certain kinds of I/O activity may not be intuitive, an empirical measurement of the workload can lead to a more optimal tiering deployment.

Figure 1: The pattern of access to data, from creation to disposition, is usually characterized as the information life cycle.

Selecting the best workload for each tier

The best way to determine which data belongs on which tier is by analyzing Resource Management Facility (RMF) data for the tablespaces to determine I/O characterization. For example, when looking for data that might be suitable for SSDs, it is instructive to examine System Management Facility (SMF) 42 subtype 6 records, which show the tablespaces with a high DISCONNECT time. The high DISC time is usually an indicator that the page to be read was not in the storage controller cache for the DB2 synchronous read. Data sets with high average DISC times in the interval would generally be likely candidates for moving to SSDs.

But even if you identify data sets with a high DISC time (that is, that are getting storage cache “misses”), data sets that have the highest miss percentages are usually the largest data sets because they cannot be held in storage cache easily. What you really need is to understand the “miss density” of your data sets. The miss density is how many cache misses you get per gigabyte of tablespace storage. You can compute this value with some artful spreadsheet work using DCOLLECT and the SMF records.

Other metrics from the SMF reports can indicate data sets with minimal I/O activity. Those data sets could be candidates for the bronze tier of storage, especially if they are large.

Although it is not possible to go into all the details of this type of analysis in this article, some general rules can be derived:

  • Random read I/O activity is the best workload for the SSDs, especially the activity that results in a lot of storage cache misses.
  • Highly sequential activity is best for the FC or SATA drives
  • High write activity is best for FC drives without using parity protection. If the storage controller is cache-constrained, the high write activity could be directed at SSDs since the writes can be destaged to disk faster than with FC or SATA drives.
  • Low I/O activity is best for SATA drives.

Figure 2: Static tiering using SMS storage groups matches storage tiers to data characteristics.

Using DFSMS to create a manual storage tiering strategy

Ideally, storage mechanisms and databases would automatically work together to identify the best tier for each data, and then shift the data to that tier—but we’re not there yet. In the meantime, you can use IBM Data Facility Storage Management Subsystem (DFSMS) to deploy a rudimentary, manual tiering methodology. By constructing storage groups that contain volumes of similar performance characteristics, you can then use Automatic Class Selection (ACS) in combination with storage class settings to direct DB2 tablespaces to the most appropriate tier.

This is rudimentary for three reasons: First, the control of data set placement is at allocation time (when the data set is created). This does not take into account the data life cycle. What happens when the data is no longer accessed frequently and becomes a candidate for a lower-performing tier? Second, it assumes that you know tablespace performance requirements in advance. What if those requirements are unknown and the tablespace is inadvertently placed on the wrong tier? How do you move it? Third, maybe only some of the tablespace is “hot” and the rest is cold, so it should exist on multiple tiers.

A storage tiering solution

When tablespaces are known to have persistent, deterministic performance requirements, a static storage-tiering technique can be applied, perhaps using DFSMS. The overall goal is to match the storage tiers to the performance and I/O characteristics of the data. Figure 2 depicts a simple static tiering solution that could be implemented using Storage Management Subsystem (SMS) storage groups. In reality, the tiering model is rarely this simple due to the volatility of data access patterns.

Data movement between tiers

In any tiering model, there are a number of ways to move data from tier to tier transparently, without affecting database availability:

  • DB2 Reorg utility: Using the online reorg to move a tablespace from one disk to another is simple and effective. Controlling where it goes is not so easy, but can be achieved using guaranteed space or by changing ACS routines.
  • DB2 partitioning: Prudent choice of a partitioning key can allow the DBA to roll DB2 partitions for a tablespace from one tier of storage to another using the data set movement tools listed below.
  • Third-party tools: These include IBM Softek TDMF, IBM Softek zDMF, EMC z/OS Migrator, EMC Virtual LUN Migrator, EMC FAST, Hitachi Tiered Storage Manager, FDRPAS, and FDRMOVE. Note that when using these tools, you must validate which ones you need based on whether you need to move the data at the data set level or the volume level, and whether the DB2 tablespace or volume is actively being used.

Storage tiering: Heading toward automation

Storage tiering for DB2 for z/OS can certainly improve ROI and increase performance optimization, although it can be a burden for DBAs to manage the whole strategy. DBAs need intelligent storage systems that can automatically act on database environments and place data at the tablespace level (or at an even lower level of granularity) on the optimal storage tier based on user-provided policy settings. In the meantime, it’s good practice—and potentially lucrative—to familiarize yourself with the characteristics of the storage that you request or that you’re assigned, and think about the best ways to use it.