More Musings on Informix 11.7
The best database keeps getting better
First of all I would like to welcome Art Kagel, who is co-writing with me for this issue. Art and I have worked together for many years, and in January 2010 he joined Advanced DataTools to focus on Informix database development, support, and training. We worked together, evaluating and testing Informix 11.7 server (formerly code-named Panther) during the beta program. In this column, we wanted to share some more thoughts that we didn’t have room for in the main article.
Optimizing star schema
Informix has always been a great datamart and data warehouse server. Four enhancements in 11.7 make Informix a real contender for larger warehouses and data vaults. Some of these enhancements are code merged from the old Informix Extended Parallel Server (XPS) engine, and some are new features. The new features are:
- Elimination of indexes on foreign keys
- Forest of Trees indexes
- Multiple index scans
- Star schema optimizer support
The elimination of indexes on foreign keys helps when you have a fact table with billions of rows and each added index would slow down inserts and loads. Forest of Trees indexes combine the advantages of a hash index with those of a traditional B+tree index when the first column of the index is not very selective. Multiple index scans in the optimizer are a real benefit for data warehousing, as they can reduce the number of long key compound indexes, which can also improve insert, delete, and update performance. These index scans also help online transaction processing (OLTP) databases and are described in much more detail in the main article.
However, we want to spend a bit more time on star schema optimizer support, which is an offshoot of XPS. In 11.7, when you have many dimension tables, you can create a single compound index on the fact table containing all of the dimension table keys. The optimizer will recognize this index and use the filters on the dimension tables to generate a list of valid combinations of the dimension tables’ keys into a temp table. The optimizer will then join that temp table to the fact table using the compound index to quickly locate the rows that satisfy the criteria on the dimension tables. We have tested this one quite a bit and the results are impressive.
Finally, here’s one we haven’t tried yet: the documentation indicates that the star schema optimizer can also combine with multi-index scan technology if an appropriate compound index is not available. In this case, the optimizer uses another technology from XPS known as push-down hash joins to perform the query while examining fewer rows than would be possible using older complex query processing techniques. In this case, the first column of any index, simple or compound, is a candidate to satisfy the star join. Informix 11.7 contains an exciting new release of this feature for those of us who have been implementing Informix as a data warehouse server.
No more extent worries
There are two problems with tables that have lots of extents, or at least there used to be. Prior to 11.7, if your tables had close to 200 extents (the exact number varies based on page size, the number of “special columns,” and the number of attached indexes in the table), you had to be concerned that you might fill the table’s extent list and the table would no longer accept inserts or the expansion of variable length rows. That problem is gone. Informix 11.7 extends the table’s extent capability to support more than 32,000 extents, making the extent list virtually unlimited due to the effects of extent doubling.
The second problem, of course, is that a table or an index with many highly active extents will suffer from performance degradation due to increased disk head movement and increased contention for head positioning with other objects in the database. Before the 11.50xC4 release, you had only a few options to reorganize a table with many active extents (and fewer for highly fragmented indexes), and all of them required that the table be taken offline for a time. You could unload the data, and then drop and re-create the table with a larger extent size; you could create a CLUSTER index on the table; or you could perform an ALTER FRAGMENT ... INIT on the table or index.
When the compression feature was added in 11.50.xC4, it brought along with it the REPACK and SHRINK SQL application programming interface (API) commands. This provided some improvement on the table reorg problem, because you could use these options to perform a partial reorganization in the background without taking the table offline. However, the REPACK option does not necessarily reduce the number of extents. It simply moves all of the data to the earliest extents possible in the existing extent list, and the SHRINK option will free up any extents that are unused after the REPACK (including any unused portion of the last partially used extent). Since most tables’ largest extents are the last ones added and the smallest ones are the first added, the compression feature doesn’t help much.
Informix 11.7 introduces the new DEFRAGMENT SQL API option, which moves data to new extents, creating contiguous extents that can be coalesced into a single larger extent wherever possible. This can all be done while the table or index is online. Since the engine performs this operation a page at a time as many very small transactions, very little contention is created. There is no possibility of a long transaction rollback, and best of all, since the pages are moved whole and the order of pages in the table doesn’t change, each row’s address does not change in the table—so no index updates are required. This operation is very fast, even compared to a REPACK.
At last, a true low-overhead online table reorg capability. Also, you can stop the operation at any time and restart it again later. Here’s all you need to do to reorg your table, index, or table/index fragment into the minimum number of extents possible:
EXECUTE FUNCTION TASK( 'DEFRAGMENT', 'database:owner.tablename' );
EXECUTE FUNCTION TASK( 'DEFRAGMENT PARTITION', <partnum> );
We want to wrap up by touching briefly on the new storage provisioning features. We could go on for pages, but the important thing is that you can now configure your Informix instance so that you never again need to worry about a dbspace running out of storage space. You can configure it so that either existing chunks are extended, or so that a new chunk will be added to a dbspace when it fills. You define a pool of storage, which can be RAW, COOKED, or filesystem files or even a directory name (the engine will create and allocate new files on its own within those directories). Once you have defined the storage pool, you can either use the pool when you manually add dbspaces or chunks or extend existing chunks, or you can define rules that the engine will use to automatically extend chunks or expand dbspaces with new chunks when the free space in the dbspace drops below the thresholds you define.
As John Miller III, STSM, embedability architect from IBM, says, “Informix will do it all while you are watching your favorite TV show!”
|[followbutton username='IBMdatamag' count='false' lang='en' theme='light']|