Applied Intelligence

Develop smart and put DB2 to work

Even if you have a team of very talented application programmers, they can still be doing the wrong kind of work and reducing the value that they deliver to the organization. One glaring example that I often see: using application code to do things that DB2 can do more effectively.

In the 1980s, when DB2 was brand new, a lot of application developers were learning how to program using SQL (an IBM invention, by the way). Mastery takes time, but we’re now 27 years after the introduction of DB2, and there are still programmers who are joining tables in their application code and failing to take advantage of the set-oriented nature of SQL. In other words, they’re doing in their application code what DB2 should be doing. In this column I’ll try to make the case for allowing DB2 to do all the work that it can for you.

Getting more value from programmers’ time

Nobody likes to see a development group reinventing the wheel. When a programmer codes a useful routine in such a way that other programmers can't readily discover its functionality or easily use it, the result is needless duplication of effort as the others end up coding the same logic in their programs from scratch. It can also increase application maintenance time down the line, which then cuts into time for new application development, because similar functionality has to be updated in multiple programs.

DB2 can be a big help here. There are multiple ways to place logic within the DB2 level of an application, making the logic available to any program that accesses the database. For example, a trigger can cause an action (almost anything that can be expressed in SQL) to be taken automatically in response to a row being inserted into, deleted from, or updated in a table. A trigger could be used to ensure that an employee's salary is within an allowable range by signaling an error if an update is outside that range. Triggers can also invoke DB2 stored procedures and user-defined functions, which are themselves other ways of encapsulating logic in the form of DB2 objects. A trigger could invoke a stored procedure that would issue a reorder for an item in inventory if an update of a row in a PARTS table were to take the number of in-stock items below a specified threshold.
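Here's a rough sketch of what those two triggers might look like. The EMPLOYEE and PARTS tables, their columns, and the REORDER_PART procedure are made up for illustration, and the exact trigger syntax differs a bit between DB2 for z/OS and DB2 for Linux, UNIX, and Windows:

   -- Reject a salary update that falls outside the allowable range
   CREATE TRIGGER SALARY_CHECK
     NO CASCADE BEFORE UPDATE OF SALARY ON EMPLOYEE
     REFERENCING NEW AS N
     FOR EACH ROW MODE DB2SQL
     WHEN (N.SALARY < 20000 OR N.SALARY > 250000)
       SIGNAL SQLSTATE '75001' ('Salary outside allowable range');

   -- Have DB2 call a stored procedure to reorder a part when stock runs low
   CREATE TRIGGER PARTS_REORDER
     AFTER UPDATE OF QTY_ON_HAND ON PARTS
     REFERENCING NEW AS N
     FOR EACH ROW MODE DB2SQL
     WHEN (N.QTY_ON_HAND < N.REORDER_POINT)
       CALL REORDER_PART(N.PART_NO);

Once logic like this is in the database, every program that touches EMPLOYEE or PARTS gets it for free; no application code has to remember to enforce the rule.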

The more data-centric logic is implemented in the DB2 database, the more programmers can focus on writing code that directly addresses the business needs of the organization. If there is data-related logic that you think could be broadly applicable in your enterprise, talk to a DB2 DBA about getting that deployed in the database layer of the application system. You could end up making every programmer in your organization more efficient and productive.

Future-proofing your applications

Over time, databases change. They frequently get larger, and characteristics of the data stored in the database shift. For example, a once almost-unique column in a table might come to have a large number of duplicate values. You want your applications to perform consistently through such changes, but this is tough to achieve if your programmers are doing work that DB2 should be doing.

Take the example of joining tables in application code. This means that the programmers are determining the data access path. Instantly, you have a problem: how do you know that they’ve made the right decision? Coding a cursor on table A, fetching a qualifying row, and looking for a match in table B is, in essence, what we call a nested loop join. How do you know that the nested loop method is the right way to join these tables? What about a merge join, a hybrid join (DB2 for z/OS), or a hash join (DB2 for Linux, UNIX, and Windows)?
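To put that in SQL terms, consider a hypothetical ORDERS/CUSTOMER pair of tables (the names and columns are invented for illustration). Instead of opening a cursor on ORDERS and issuing a separate singleton SELECT against CUSTOMER for every row fetched, the program can hand DB2 one statement that describes the result set:

   SELECT O.ORDER_NO, O.ORDER_DATE, C.CUST_NAME, C.CREDIT_LIMIT
     FROM ORDERS O
     INNER JOIN CUSTOMER C
       ON C.CUST_NO = O.CUST_NO
    WHERE O.ORDER_DATE >= CURRENT DATE - 30 DAYS
    ORDER BY O.ORDER_DATE;

DB2 is free to satisfy that statement with a nested loop join, a merge join, or whatever method it judges to be cheapest, and it can change its mind later without anyone touching the program.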

When DB2 performs the table join, the SQL optimizer determines the lowest-cost means of generating the result set, based on statistical information stored in the DB2 catalog. The DB2 optimizer is very good at what it does, having been continually enhanced over the 30 years since IBM invented cost-based SQL statement optimization.
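Keeping those catalog statistics current is what lets the optimizer do its job. As a sketch, using DB2 for Linux, UNIX, and Windows syntax (DB2 for z/OS has its own RUNSTATS utility, and the schema and table names here are illustrative):

   -- Refresh the statistics the optimizer relies on
   RUNSTATS ON TABLE MYSCHEMA.ORDERS WITH DISTRIBUTION AND INDEXES ALL;

   -- See the access path DB2 would choose (requires the explain tables)
   EXPLAIN PLAN FOR
   SELECT O.ORDER_NO, C.CUST_NAME
     FROM MYSCHEMA.ORDERS O
     INNER JOIN MYSCHEMA.CUSTOMER C
       ON C.CUST_NO = O.CUST_NO;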

When the database changes over time, you get a second problem. Perhaps the database grows substantially, the query result set is much larger than before, and a different table join method will deliver better performance than the method initially selected. If DB2 is handling the join, the optimizer will automatically adjust the access path. Do the join in application code, and you either live with deteriorating performance, or you rewrite the program.

Similarly, programmatic joins can prevent or complicate database maintenance. Suppose that a table join performed through program code depends on the existence of a specific index. And suppose that index needs to be removed because the table has accumulated so many indexes that insert and delete operations are costing too much. (This happens all the time in database-land.) If DB2 were handling the join, it could switch to a different join method and still deliver acceptable performance. A join method hard-coded into a program, on the other hand, can't change without rewrite effort, so the programmatic join may stand in the way of DBAs making a physical database change that could reduce run times for critical data-change operations.

Here’s the bottom line: when you let DB2 do as much as it can, you leave the access path selection process up to DB2. That, in turn, helps keep application performance consistent as the database changes over time. On top of that, consider that your organization is paying for the sophisticated SQL optimization technology built into DB2. Use it!

Enhancing CPU efficiency of applications

When a program issues an SQL statement, the statement must get from the program to DB2, and after the statement is executed, control must return to the application code. This program-to-DB2 round-trip is not free. Reducing the number of SQL statements in application code reduces the cumulative CPU cost of trips across the application program–DB2 boundary. The effect of getting the job done with fewer trips to DB2 can be significant, and here it’s important to focus not on a single unit of work but on an overall workload. A seemingly small difference in CPU time per transaction can end up being a big deal when hundreds of transactions execute per second, or when a batch job has to work through hundreds of thousands of records in an input file.

To really get things tight from a CPU efficiency perspective, developers need to keep up with, and be prepared to take advantage of, new DB2 features and functions that can reduce chattiness between an application program and DB2. Consider, for example, use of the MERGE statement, with which one can make changes to a table based on a set of input records, updating target table data when there is a match with an input record and inserting a new row into the table when there is not. That's a CPU saver compared to the old approach of doing a SELECT against the target table to see if there is a row that matches an input record and then driving an UPDATE (if there is a match) or an INSERT (if there's no match). Similarly, using multi-row INSERT (sometimes called block INSERT) to place several new rows in a table with one INSERT statement is a CPU saver, compared to the one-at-a-time way of doing things. Look for ways to accomplish data retrieval and changes with fewer SQL statement executions, and you'll reduce the load on your DB2 server.
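Here's a sketch of both ideas, reusing the made-up PARTS table from earlier (in a DB2 for z/OS program, the input values would typically come from host-variable arrays with a FOR n ROWS clause rather than the literals shown here):

   -- One MERGE replaces the SELECT-then-UPDATE-or-INSERT pattern
   MERGE INTO PARTS T
   USING (VALUES ('A100', 'Widget', 500)) AS S (PART_NO, DESCRIPTION, QTY)
     ON T.PART_NO = S.PART_NO
   WHEN MATCHED THEN
     UPDATE SET T.QTY = S.QTY
   WHEN NOT MATCHED THEN
     INSERT (PART_NO, DESCRIPTION, QTY)
     VALUES (S.PART_NO, S.DESCRIPTION, S.QTY);

   -- One multi-row INSERT replaces three single-row INSERTs
   INSERT INTO PARTS (PART_NO, DESCRIPTION, QTY)
   VALUES ('A200', 'Gadget', 250),
          ('A300', 'Sprocket', 75),
          ('A400', 'Flange', 120);

Either way, the work that used to take two or more trips across the program-to-DB2 boundary now takes one.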