Press the Answer Button
We go deep inside the Netezza data warehouse appliance to find out what makes it tick
A refrigerator isn’t complicated to use. Neither is a dishwasher nor a clothes dryer. These appliances have one or two simple controls, they don’t need a lot of maintenance, and they do one thing better than any other type of device.
Making a data warehouse that is so focused and low maintenance that it can fairly be called an appliance is no easy task, but the IBM Netezza data warehouse appliance has proven that this approach works well.
The best way to really appreciate an appliance is to pull it apart and see what makes it go. In this article, you’ll take a tour of the Netezza appliance from the inside out and learn some of the ways that it advances the state of the art in data warehousing and analytics.
How to design an integrated appliance
The Netezza appliance integrates database software, servers, and storage into a single system. Data is loaded through standard extract, transform, and load (ETL) connectors, and is then available for queries from business intelligence and analytics applications.
Inside the appliance are two major components: the host servers, which manage the connection between the appliance and the outside world, and the Snippet Blades (S-Blades), which handle the database heavy lifting.
The host servers use symmetric multiprocessing (SMP) and run a SQL relational database management system (RDBMS) on Linux. Their active/passive configuration provides full redundancy should one server fail. Whichever server is active presents a standardized interface (ANSI SQL, JDBC, ODBC) to external tools and applications. It also creates optimized query plans, compiles SQL queries into executable code segments called snippets, and distributes the snippets for execution.
Each S-Blade is a processing node: an independent server containing several multicore CPUs, multiengine field-programmable gate arrays (FPGAs), and RAM. Depending on its size, a single Netezza appliance can hold more than a hundred S-Blades. Together, the fleet of S-Blades forms a massively parallel processing (MPP) grid behind an SMP front end. This two-tier architecture is called Asymmetric Massively Parallel Processing (AMPP), as shown in Figure 1.
The host servers and S-Blades are connected to a series of high-density, high-performance, highly redundant, independent disk arrays. A high-speed network fabric running a customized IP-based protocol and optimized to scale to more than a thousand nodes connects all of the hardware components. Each node can transfer data sets simultaneously to every other node. All of the hardware components are hot-swappable.
Figure 1: Netezza AMPP architecture
Processing a query
The action begins when an application sends a query to the appliance host server, which compiles the query and creates an optimized query execution plan. That may sound like a familiar process, but the Netezza appliance takes some very unfamiliar paths.
First, the Netezza database management system (DBMS), a highly optimized engine for the massively parallel Netezza environment, does not use indexes. The logic behind this design decision is that indexes add value only when the DBA can predict which queries will be run against the database. For truly exploratory analytics work, such predictions can be nearly impossible, as data analysis is iterative and unpredictable: ask a question; consider the response; ask a better question. The only reason to repeat questions exactly is to confirm or check the results of the analysis.
Instead, the optimizer gathers up-to-date statistics on every database table referenced in a query. The optimizer combines these statistics with detailed performance knowledge of the specific system components, making it possible to accurately estimate the disk, processing, and network costs associated with an operation. The resulting query plans are based on data, rather than heuristics, and are thus extremely efficient. By using these statistics, the optimizer minimizes disk I/O and data movement, two factors that reduce performance in a data warehouse system. Another critical benefit of the non-indexed approach is that the Netezza appliance doesn’t need to be tuned at all.
As part of the transformation, the optimizer also determines the correct table join order, rewrites expressions, and eliminates any redundancy from the SQL. The optimizer can use its knowledge of the appliance components to determine the best join order in a complex join. For example, when joining multiple small tables to a large fact table, the optimizer can choose to broadcast the small tables in their entirety to each of the S-Blades, while keeping the large table distributed across all snippet processors. This approach minimizes data movement while taking advantage of the AMPP architecture to parallelize the join.
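The broadcast strategy described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Netezza code: the function names, the hash-based distribution, and the sample tables are all assumptions made for the example.

```python
from collections import defaultdict

def distribute(rows, key, num_nodes):
    """Hash-distribute the rows of a large table across num_nodes slices."""
    slices = defaultdict(list)
    for row in rows:
        slices[hash(row[key]) % num_nodes].append(row)
    return [slices[n] for n in range(num_nodes)]

def broadcast_join(fact_slices, small_table, fact_key, small_key):
    """Join each fact-table slice against a full copy of the small table."""
    results = []
    for fact_slice in fact_slices:  # each iteration stands in for one node working on its own slice
        # Every node gets the whole small table, so no fact rows need to move.
        lookup = {r[small_key]: r for r in small_table}
        for row in fact_slice:
            match = lookup.get(row[fact_key])
            if match is not None:
                results.append({**row, **match})
    return results

# Hypothetical data: a "large" sales table and a small stores table.
sales = [{"sale_id": i, "store_id": i % 3, "amount": 10 * i} for i in range(6)]
stores = [{"store_id": 0, "city": "Boston"}, {"store_id": 1, "city": "Austin"}]

fact_slices = distribute(sales, "store_id", num_nodes=4)
joined = broadcast_join(fact_slices, stores, "store_id", "store_id")
```

The large table stays where it is; only the small table is copied, which is why data movement stays low even though every node takes part in the join.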
Converting and broadcasting the query
After the optimizer has done its work, the compiler converts the query plan into executable code segments called snippets. When compiling queries, the host server uses a feature called the object cache to accelerate query performance. This is a large cache of previously compiled snippet code that supports parameter variations. For example, a snippet with the clause WHERE name = 'bob' might use the same compiled code as a snippet with the clause WHERE name = 'jim' but with settings that reflect the different name.
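A minimal sketch of the caching idea, assuming a cache keyed on the clause with its literals replaced by placeholders. The class name, the regular expression, and the toy "compiler" that handles only a single equality test are hypothetical and are not taken from the appliance.

```python
import re

class ObjectCache:
    """Toy cache of compiled snippets, keyed by the clause with literals stripped out."""

    def __init__(self):
        self._compiled = {}
        self.compile_count = 0

    def _parameterize(self, clause):
        # Pull quoted literals out of the clause and leave '?' placeholders behind.
        params = re.findall(r"'([^']*)'", clause)
        template = re.sub(r"'[^']*'", "?", clause)
        return template, params

    def _compile(self, template):
        # Stand-in for real code generation: handles only "WHERE <column> = ?".
        column = template.split()[1]
        return lambda row, params: row[column] == params[0]

    def get_snippet(self, clause):
        template, params = self._parameterize(clause)
        if template not in self._compiled:
            self.compile_count += 1
            self._compiled[template] = self._compile(template)
        return self._compiled[template], params

cache = ObjectCache()
bob_code, bob_params = cache.get_snippet("WHERE name = 'bob'")
jim_code, jim_params = cache.get_snippet("WHERE name = 'jim'")
```

Both clauses reduce to the same template, so the second lookup reuses the code compiled for the first and differs only in its parameter list.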
Snippets are then sent to the S-Blades to be executed in parallel. To ensure efficient handling, a scheduler runs on the host servers, balancing execution across complex workloads to meet the needs of multiple users, while maintaining high utilization and throughput. The scheduler considers a number of factors—including query priority, size, and resource availability—in determining when to execute snippets on the S-Blades. Like the optimizer, the scheduler uses the appliance architecture to gather metrics about resource availability from each component of the system. Sophisticated algorithms enable the scheduler to maximize system throughput, utilizing nearly 100 percent of the disk bandwidth, while ensuring that memory and network resources are not overloaded—a common cause of thrashing. When the scheduler gives the green light, the snippet is broadcast to the snippet processors, which reside on the S-Blades.
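To make the scheduling trade-off concrete, here is a deliberately simplified sketch that admits snippets in priority order as long as enough memory remains. The field names, the single memory budget, and the admission rule are assumptions for illustration; the article does not describe the scheduler's actual algorithms.

```python
import heapq

def schedule(snippets, memory_available):
    """Split snippets into those dispatched now and those deferred.

    Snippets are considered highest priority first and are dispatched
    only while they fit in the remaining memory budget.
    """
    # heapq is a min-heap, so negate the priority; the index breaks ties in arrival order.
    queue = [(-s["priority"], i, s) for i, s in enumerate(snippets)]
    heapq.heapify(queue)

    dispatched, deferred = [], []
    while queue:
        _, _, snippet = heapq.heappop(queue)
        if snippet["memory"] <= memory_available:
            memory_available -= snippet["memory"]
            dispatched.append(snippet["name"])
        else:
            deferred.append(snippet["name"])  # would overload memory, so hold it back
    return dispatched, deferred

# Hypothetical workload competing for 6 units of memory.
pending = [
    {"name": "report", "priority": 1, "memory": 4},
    {"name": "dashboard", "priority": 3, "memory": 2},
    {"name": "etl", "priority": 2, "memory": 5},
]
dispatched, deferred = schedule(pending, memory_available=6)
```

In this run the lower-priority report is dispatched ahead of the deferred ETL snippet because it fits in the memory that is left, which keeps utilization high without overcommitting resources.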
In addition to the scheduler on the host servers, the snippet processors have their own smart preemptive scheduler that allows snippets from multiple queries to execute simultaneously. To decide when and for how long to schedule a particular snippet for execution, the scheduler takes into account the priority of the query and the resources set aside for the user or group that issued it.
Figure 2: Inside a Netezza S-Blade, showing the software engines in the FPGA
Hot hardware SQL action at the S-Blade
Each snippet sent by the host server arrives at the S-Blade with two elements: compiled code to be executed by individual CPU cores, and a set of FPGA parameters used by four virtual engines that accelerate the overall processing of the query (see Figure 2). The Compress engine decompresses data streamed from disk. The Project and Restrict engines then reduce the scope of data that the rest of the query will act on, first eliminating columns not specified in the SELECT or JOIN clauses, and then removing rows based on restrictions specified in the WHERE clause. Finally, the Visibility engine filters other rows that should not be “seen” by a query—such as those locked by a transaction not yet committed—and maintains atomicity, consistency, isolation, and durability (ACID) compliance.
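The four engines behave like stages in a streaming pipeline, which the following software sketch imitates. It is an analogy only: the real engines run in FPGA hardware, and the zlib-compressed JSON block format, the txn bookkeeping column, and the set of committed transaction IDs are assumptions invented for the example.

```python
import json
import zlib

def decompress(block):
    """Compress engine: turn a compressed disk block back into rows."""
    return json.loads(zlib.decompress(block))

def project(rows, columns):
    """Project engine: keep only the columns the query needs."""
    return ({c: row[c] for c in columns} for row in rows)

def restrict(rows, predicate):
    """Restrict engine: drop rows that fail the WHERE condition."""
    return (row for row in rows if predicate(row))

def visibility(rows, committed_txns):
    """Visibility engine: hide rows written by uncommitted transactions."""
    return (row for row in rows if row["txn"] in committed_txns)

def scan_block(block, columns, predicate, committed_txns):
    rows = decompress(block)
    rows = project(rows, columns + ["txn"])  # keep the bookkeeping column for the visibility check
    rows = restrict(rows, predicate)
    rows = visibility(rows, committed_txns)
    return [{c: row[c] for c in columns} for row in rows]

# Hypothetical block holding three rows; transaction 11 has not committed.
table_rows = [
    {"id": 1, "name": "bob", "region": "east", "txn": 10},
    {"id": 2, "name": "jim", "region": "west", "txn": 10},
    {"id": 3, "name": "bob", "region": "west", "txn": 11},
]
block = zlib.compress(json.dumps(table_rows).encode())
result = scan_block(block, ["id", "name"], lambda r: r["name"] == "bob", {10})
```

Only one of the three rows survives all four stages, so the CPU core that runs the rest of the snippet has far less data to handle.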
A processor core then picks up the uncompressed, filtered data block and performs fundamental database operations such as sorts, joins, and aggregations. It also applies any complex algorithms embedded by an analytics application. All this happens in parallel on hundreds of snippet processors, depending on the model of the appliance. Every node then sends its result over the network fabric to other S-Blades or the active host server, which communicates the resulting data set out from the appliance to the requesting application.
All snippet processors now have snippet results that must be assembled. The snippet processors use the intelligent network fabric to communicate flexibly with the host servers and with each other to perform intermediate calculations and aggregations. The host server assembles the intermediate results received from the snippet processors, compiles the final result set, and returns it to the user’s application. Meanwhile, other queries are streaming through the system at various stages of completion.
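The assembly step resembles a two-level aggregation: each snippet processor reduces its own slice, and the host merges the partial results. The sketch below assumes a simple grouped sum; the function names and the sample data are hypothetical.

```python
from collections import Counter

def snippet_partial_sum(data_slice, group_key, value_key):
    """Run on each snippet processor: aggregate only the local slice."""
    partial = Counter()
    for row in data_slice:
        partial[row[group_key]] += row[value_key]
    return partial

def host_assemble(partials):
    """Run on the host: merge the partial sums into the final result set."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values for matching keys
    return dict(total)

# Hypothetical slices held by three snippet processors.
slices = [
    [{"region": "east", "amount": 5}, {"region": "west", "amount": 7}],
    [{"region": "east", "amount": 3}],
    [],
]
partials = [snippet_partial_sum(s, "region", "amount") for s in slices]
final = host_assemble(partials)
```

Because each processor ships only a small partial result, the host's final merge is cheap compared with the scan and aggregation done in parallel on the blades.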
One appliance, many answers
The Netezza system combines advanced hardware technology with careful study of how to maximize database performance in a purpose-built and optimized environment. The result is truly an appliance: simple to operate, easy to maintain, and better than just about anything else at what it does. Need answers? Just drop in your data and press the button.
Netezza in retail
Retail can be a tough market for data warehouse technology. Retailers operate on thin margins, and when they invest in technology they expect it to work—no ifs and no buts. Their industry is highly competitive, so to succeed the warehouse must help the business by bringing new, value-creating applications rapidly into production.
Many data warehouses don’t fully succeed. In their most recent Magic Quadrant for Data Warehouse Database Management Systems, published on January 28, 2011, Gartner analysts estimate that “Nearly 70% of data warehouses experience performance-constrained issues of various types.”1 Warehouses running on older database technology also make it difficult to develop new applications: 57 percent of respondents surveyed for a report published by The Data Warehousing Institute (TDWI)2 in July 2011 expressed frustration with backlogs of requests for new business intelligence applications and with IT’s inability to satisfy new requests in a timely manner.
Retailers have been quick to embrace the Netezza appliance. “We are a very data-driven company, and we need to make information available to the business faster to drive decision making. We were experiencing performance issues with our legacy BI [business intelligence] platform and were looking to speed up our ability to crunch through and analyze the vast amounts of new data we generate daily,” says Ed Macri, vice president of advertising and business intelligence at CSN Stores. With the Netezza appliance, CSN can answer in seconds or minutes questions that previously took hours or days.3 At another large U.S. retailer, slow data load jobs were causing problems, because their indexes needed to be destroyed and rebuilt each time new data was loaded. The Netezza appliance does not use indexes, and the retailer was able to reduce load times from more than five hours to less than three minutes. In these and many other cases, Netezza is helping retailers get answers faster.
1 Complimentary copies of the Gartner report are available at netezza.com/gartnermq/2011.
2 Claudia Imhoff and Colin White, Self-Service Business Intelligence: Empowering Users to Generate Insights (The Data Warehousing Institute, July 1, 2011).
3 IBM, “CSN Stores Enhances Shopping Experience with IBM Data Warehouse Appliance,” news release, January 11, 2011.