Blogs

Adding and Removing Data Partitions

An effective way to migrate large blocks of data

DB2 for LUW Offering Manager, IBM

In my last column, I explained how you can use table partitioning to organize data across multiple storage objects based on values in one or more columns. I also described two ways to use the PARTITION BY clause of the CREATE TABLE statement to create partitioned tables.

One advantage to using partitioned tables is that only relevant data partitions are accessed during query execution; because the DB2 Optimizer is data-partition aware, only relevant data partitions are scanned to resolve a query, resulting in fewer I/Os and higher query performance. Another advantage is that new partitions can easily be added to increase the table range or existing tables containing data can easily be attached (rolled in), while partitions containing old or obsolete data can be removed (rolled out) and archived, for example, to meet company policies or federal laws that mandate record retention.

In this column, I’ll walk you through the process of adding and removing data partitions, and I’ll show you how enhancements made in DB2 9.7 can make the process easier and faster.

Adding new partitions to a partitioned table

There are two ways to add new partitions to a partitioned table: you can add new ranges and increase the capacity of a table by adding one or more empty partitions, or you can add existing, populated tables by converting them into new partitions. As you might imagine, the process used depends on which approach you choose. A new, empty partition can be added to a partitioned table by executing the ALTER TABLE statement with the ADD PARTITION option specified; the syntax used looks like this:

ALTER TABLE [TableName]
ADD PARTITION <PartitionName>
<STARTING <FROM> [Start | MINVALUE | MAXVALUE] |
STARTING <FROM> ([Start | MINVALUE | MAXVALUE] ,...)
<INCLUSIVE | EXCLUSIVE>>
ENDING <AT> [End | MINVALUE | MAXVALUE] |
ENDING <AT> ([End | MINVALUE | MAXVALUE] ,...)
<INCLUSIVE | EXCLUSIVE>
<IN [TSName]>
<INDEX IN [IndexTSName]>
<LONG IN [LongTSName]>

where:

  • TableName: Identifies the partitioned table, by name, that a new partition is to be added to
  • PartitionName: Identifies the unique name, if any, to be assigned to the partition to be added
  • Start: Specifies the low end of the range for the data partition
  • End: Specifies the high end of the range for the data partition
  • TSName: Identifies the table space in which the new partition is to be stored
  • IndexTSName: Identifies the table space in which any partitioned indexes for the new partition are to be stored
  • LongTSName: Identifies the table space in which the values of any long columns are to be stored

Note: Parameters shown in angle brackets (< >) are optional; parameters or options shown in normal brackets ([ ]) are required and must be provided. A comma followed by ellipses (...) indicates that the preceding parameter can be repeated multiple times.

Thus, if you wanted to add an empty partition to a partitioned table named SALES, you could do so by executing an ALTER TABLE statement similar to this:

ALTER TABLE sales ADD PARTITION q4_sales
STARTING '10/1/2010' ENDING '12/31/2010' IN tbsp3;

When such a statement is executed, a new table is created and logically made a part of the partitioned table. The system catalog table SYSCAT.DATAPARTITIONS is then updated to reflect the change.

Attaching a populated table to a partitioned table

Tables that already contain data can be added to a partitioned table as new partitions by executing the ALTER TABLE statement with the ATTACH PARTITION option specified. The syntax used for this type of operation looks like this:

ALTER TABLE [TableName]
ATTACH PARTITION <PartitionName>
<STARTING <FROM> [Start | MINVALUE | MAXVALUE] |
STARTING <FROM> ([Start | MINVALUE | MAXVALUE] ,...)
<INCLUSIVE | EXCLUSIVE>>
ENDING <AT> [End | MINVALUE | MAXVALUE] |
ENDING <AT> ([End | MINVALUE | MAXVALUE] ,...)
<INCLUSIVE | EXCLUSIVE>
FROM [SourceTable]
<BUILD MISSING INDEXES | REQUIRE MATCHING INDEXES>

where TableName, PartitionName, Start, and End are the same as before and SourceTable identifies the base table that is to be used as the source of data for the new partition.

So, if you wanted to attach a base table named Q4_2010_SALES to an existing partitioned table named SALES as a new partition, you could do so by executing an ALTER TABLE statement that looks like this:

ALTER TABLE sales ATTACH PARTITION q4_sales
STARTING '10/1/2010' ENDING '12/31/2010'
FROM q4_2010_sales;

After a base table is successfully attached to a partitioned table, the system catalog is updated to reflect the newly attached partition while information about the original base table is removed from the catalog. No data movement is involved. This means that the process of attaching a new partition takes place very quickly—typically within a few seconds. (It’s important to note that to attach an existing base table to a partitioned table, the base table’s design must be similar to that of the partitioned table; see sidebar, “Conditions for attaching a table as a new partition.”)

Anytime a base table is attached to a partitioned table, the partitioned table is automatically placed in “Set Integrity Pending” state. This means that integrity checking must be performed immediately after a new partition is attached. To check integrity after adding a base table named Q4_2010_SALES to an existing partitioned table named SALES (as a new partition), you would need to execute a SET INTEGRITY statement that looks similar to this:

SET INTEGRITY FOR sales
ALLOW WRITE ACCESS
IMMEDIATE CHECKED;

Removing partitions

Just as you can add or attach new partitions to a partitioned table, you can also remove existing partitions. (Removed partitions become regular, stand-alone base tables.) Partitions can be removed by executing the ALTER TABLE statement with the DETACH PARTITION option specified; the syntax for this statement looks like this:

ALTER TABLE [TableName]
DETACH PARTITION <PartitionName>
INTO [TargetTable]

where:

  • TableName: Identifies the partitioned table, by name, that a partition is to be removed from
  • PartitionName: Identifies the partition, by name, that is to be removed
  • TargetTable: Identifies the base table where the partition’s data is to be stored

Thus, if you wanted to remove a partition named Q1_SALES from a partitioned table named SALES and store its data in a table named Q1_2010_SALES, you could do so by executing an ALTER TABLE statement that looks like this:

ALTER TABLE sales
DETACH PARTITION q1_sales
INTO q1_2010_sales;

Once the table Q1_2010_SALES becomes available, you can either drop it (to delete the data) or attach it to another partitioned table (for example, to archive it).

Conditions for attaching a table as a new partition

Before an existing base table can be added (attached) to a partitioned table, the following conditions must be met:

  • The source table must be a nonpartitioned table or a partitioned table that has only one partition and no attached or detached partitions.
  • The source table cannot be a typed table or a range-clustered table.
  • The source table must be droppable.
  • The source and target table definitions must match:
  • The number of columns must be the same.
  • The data types of each column must be the same. (For columns with XML or LOB data types, INLINE LENGTH values must be the same.)
  • The nullability characteristics of each column must be the same.
  • Any default constraints defined for columns must be the same.
  • Source and target table compression specifications must match.
  • Characteristics (table space type, page size, extent size) of the table spaces used by the source and target tables must match.
  • If the target table is distributed across database partitions, the source table must also be distributed using the same distribution method.
  • If the target table is a multidimensional clustering (MDC) table, the source table must also be an MDC table.
  • XML data formats used by the source and target tables must be the same.

DB2 9.7 table partitioning enhancements

Prior to DB2 9.7, indexes for partitioned tables were nonpartitioned. This meant that whenever partitions were added or removed, corresponding records had to be added to or removed from any indexes that existed for the table.

With DB2 9.7, partitioned indexes are now supported, allowing each data partition to be indexed separately. This provides several benefits, including improved performance when adding or removing partitions and the ability to perform select maintenance operations at the partition level. Starting with FixPack1, you can perform REORG operations on a single data partition rather than on the entire partitioned table, resulting in faster REORG operations while increasing availability to other data partitions.

Other table partitioning enhancements introduced with DB2 9.7 include XML support (available with FixPack1 or greater) and higher availability during ALTER TABLE ... DETACH PARTITION operations—the removal of a partition no longer requires that all current transactions to the table be completed, allowing long-running queries to continue without impeding the partition removal process.

Partitioning for performance

Data partitioning provides many useful benefits, particularly when used in data warehouse and decision support environments. By taking advantage of these features, you can manage data more effectively, while delivering optimum database performance.

Special thanks to Mike Winer, senior technical staff member–DB2 kernel architect, and Liping Zhang, DB2 data partitioning development lead, for providing information that was used to develop this article.

Resources