Analyze Data Where It Lives: On System z

Why you should consider moving the queries to the data

I can say, with a high degree of confidence, that your organization has at least one analytics application. You have one or more data stores that support this analytics workload: perhaps a data warehouse, some data marts, maybe an operational data store (ODS).

But here’s a question for you: where does the data in those data stores come from? What’s the source system? At many sites, the answer is “a mainframe.” Worldwide, a large share of the foundational data for operational, run-the-business applications is stored in mainframe systems, often in IBM® DB2® for z/OS® databases.

For years, the conventional wisdom held that this data should be extracted and loaded into distributed systems servers for analytics purposes. That trend has lately been reversing, with the about-face driven by a simple but powerful concept: instead of moving mainframe data to query systems, why not move the queries to the data? Why not do your analytics processing on IBM System z®, where the source data lives?


It just makes sense

Several factors are behind this “back to the source” movement:

  • The desire for infrastructure simplification. Once you’ve started down the “move the data” path, query systems can proliferate and the resultant source-to-target data propagation infrastructure can end up looking like a spider web. Managing all those data flows can consume a lot of server, disk, network, and staff resources. On top of that, having multiple query systems can introduce data quality control and security issues—particularly when some of those systems are managed at a departmental level, outside the IT organization. Moving the queries to the data on System z means fewer disparate data copies, and that can lower costs and reduce risk.
  • The reclassification of analytics applications as tier-one systems. The term “tier one” is often used to denote systems that are of greatest importance to the business—the systems that require maximum uptime. Analytics, once a capability that was simply nice to have, is increasingly a must-have application. When business intelligence capabilities are seen as mission-critical, the mainframe’s rock-solid reliability makes it a prime analytics platform.
  • The demand for ever-fresher decision-support data. It was once standard practice to update data warehouses nightly, and in many cases that is still a desired and reasonable approach. However, it is increasingly common for analytics application users to demand that source data changes be reflected almost immediately in the data they query. Placing a data warehouse on the same mainframe system as the data from which it is sourced facilitates near-real-time replication of source data changes.

These factors alone would drive the growth of analytics workloads on System z. But there’s more to this story.


Sweetening the deal: Outstanding analytics capabilities

The favorable winds blowing business intelligence applications and data back toward System z are complemented by recent hardware and software technology enhancements that have made the mainframe a better-than-ever analytics platform. DB2 for z/OS supports online analytical processing (OLAP) specifications in SQL statements, provides index-on-expression functionality to help speed the execution of complex queries, and features a SQL optimizer that can automatically rewrite a complex query to generate the associated result set more efficiently.
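
To make two of those SQL features concrete, here is a minimal sketch of an OLAP specification (a ranking within each partition) and an index on an expression. It is an illustration only: it runs against SQLite through Python's standard sqlite3 module as a stand-in, not against DB2 for z/OS, and the sales table, its columns, the index name, and the rank_sales_in_region function are all hypothetical. DB2 syntax and optimizer behavior may differ in the details.

```python
import sqlite3


def rank_sales_in_region(rows, region):
    """Rank sales reps by amount within one region.

    rows   -- iterable of (region, rep, amount) tuples (hypothetical data)
    region -- region name, matched case-insensitively
    """
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not DB2 for z/OS
    try:
        conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

        # Index on an expression: the index key is UPPER(region), so a
        # predicate written against UPPER(region) is eligible to use it.
        conn.execute("CREATE INDEX ix_sales_region_upper ON sales (UPPER(region))")

        # OLAP specification: RANK() is computed per region partition,
        # highest amount first; tied amounts share a rank.
        query = """
            SELECT region, rep, amount,
                   RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
            FROM sales
            WHERE UPPER(region) = UPPER(?)
            ORDER BY rnk, rep
        """
        return conn.execute(query, (region,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("East", "Ann", 300.0),
        ("East", "Bob", 500.0),
        ("West", "Dee", 400.0),
    ]
    for row in rank_sales_in_region(sample, "east"):
        print(row)
```

The point of the sketch is that the ranking happens inside the database engine, next to the data, rather than in an application that first has to pull the rows out.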

Linux on System z is also soaring in popularity as more organizations see the performance and cost-savings benefits of running tools such as IBM Cognos® Business Intelligence (for reporting, scorecarding, and dashboarding) and IBM SPSS® (for predictive analytics) on the same “box” as a target DB2 for z/OS database. DB2 10 for z/OS can work with SPSS for Linux on System z to enable real-time scoring of transactional data updates. In addition, the IBM DB2 Analytics Accelerator extends the capabilities of a DB2 for z/OS system, delivering outstanding throughput and response time for both high-volume operational analytics queries and complex SQL statements that involve huge data scans (“shockingly fast,” as an IT manager friend of mine put it).

That’s quite an analytics package—and these are just a few of the highlights. Here’s your key takeaway: if you are feeding data from a DB2 for z/OS database into an off-mainframe query system, you have all kinds of reasons—among them quality of service, data control, infrastructure simplification, and cost efficiency—to look at querying that data, for decision support purposes, on the platform of origin (System z). Lots of organizations have gone that route, and they’re reaping the benefits. The data analytics game is serious business. What’s your move?
