A Smarter Approach to Tackling Big Data

The new IBM PureData System for Analytics: Focusing on analytic performance and data center efficiency

Director of Offering Management, IBM Analytics, IBM

On February 5, 2013, at the PureSystems Executive Summit in Huntington Beach, California, IBM introduced the latest IBM® PureSystems™ models intended to help customers tackle big data. The latest in the PureSystems lineup comes from the IBM PureData™ family. PureData System for Analytics model N2001 joins the existing N1001 that was announced in October 2012. Model N1001 received a significant performance increase with up to 20 times improvement in concurrency and throughput. The N1001 offers a broader selection of smaller size configurations and a lower-cost entry point. However, the N2001 was specifically designed for the new era of big data. This system focuses not only on increasing analytics performance, but also on data center efficiency.

Why is data center efficiency important?

Companies today are increasingly dependent upon analytics. They want to leverage more data, and at the same time they are dealing with extreme data growth. The New York Stock Exchange (NYSE), for example, sees its data double or triple in volume each year. For compliance reasons, it has to maintain seven years of history online and on premises. This would be a challenge anywhere, but it’s especially difficult in a place like New York City, where data center space is not exactly elastic. What companies like NYSE need is a solution that gives them not only exceptional analytic performance, but better data center efficiency as well.

Data center efficiency can mean a variety of things: controlling capacity and density, reducing power and cooling, or managing with minimal resources. PureData System for Analytics N2001 is designed to help organizations take a smarter approach to efficiency.

What’s smarter about PureData System for Analytics?

The new PureData System for Analytics offers 50 percent greater capacity per rack than previous systems. When you compare rack-to-rack capacity with some competitors—Teradata, for example—PureData System for Analytics offers up to 160 percent more overall capacity. But despite this dramatic capacity increase, power and cooling requirements are no higher than in previous versions of the system. In fact, when compared to Oracle Exadata X3, PureData System for Analytics has better capacity per rack and uses 40 percent less power.

A Smarter Approach to Tackling Big Data – Figure 1

Organizations should consider efficiency as well as performance when making investments in their analytic systems. So that users can be data experts, not database experts, PureData System for Analytics is powered by IBM Netezza® technology, which delivers outstanding analytic performance without requiring any database or storage administration. The latest version of PureData System for Analytics also provides a new and improved administration console and offers new capabilities for capacity planning.

And let’s not forget about performance…

The PureData System for Analytics is now many times faster than previous generations and boasts a best-in-class scan rate.* The scan rate is a limiting factor in how quickly data can be read and processed.

The base scan rate of Teradata systems is only about 38 GB/sec, and even with automatic compression activated out of the box, they achieve an estimated compression ratio of just two to three times. Oracle’s disk bandwidth is even lower, at about 25 GB/sec, and it relies on Flash to boost that bandwidth closer to 100 GB/sec. However, most data warehouse workloads won’t take advantage of Oracle Smart Scan technology, and Oracle does not have compression turned on by default. But as you can see in the following figure, IBM PureData System for Analytics offers the best base scan rate, 128 GB/sec, with an estimated compression ratio of four times at startup.

A Smarter Approach to Tackling Big Data – Figure 2
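To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the base scan rates and compression estimates quoted above. Multiplying the base rate by the compression ratio to get an "effective" scan rate is a simplifying assumption for illustration, not a published vendor formula, and the function names and the 1 TB example are hypothetical.

```python
# Figures quoted in the article: base scan rate in GB/sec and the estimated
# out-of-the-box compression range (low, high). A ratio of 1 means no
# compression by default. The Flash figure is approximate ("closer to 100").
SYSTEMS = {
    "IBM PureData N2001": {"base_gb_per_sec": 128, "compression": (4, 4)},
    "Teradata": {"base_gb_per_sec": 38, "compression": (2, 3)},
    "Oracle Exadata X3 (disk)": {"base_gb_per_sec": 25, "compression": (1, 1)},
    "Oracle Exadata X3 (Flash)": {"base_gb_per_sec": 100, "compression": (1, 1)},
}


def effective_scan_rate(base_gb_per_sec, compression):
    """Return (low, high) effective GB/sec of user data scanned.

    Assumption: each compressed gigabyte read from disk expands to
    `compression` gigabytes of user data, so the rates simply multiply.
    """
    low, high = compression
    return base_gb_per_sec * low, base_gb_per_sec * high


def seconds_to_scan(user_data_gb, effective_gb_per_sec):
    """Idealized time to scan a given volume of user data at a given rate."""
    return user_data_gb / effective_gb_per_sec


if __name__ == "__main__":
    one_terabyte_gb = 1024  # hypothetical table size, for illustration only
    for name, spec in SYSTEMS.items():
        low, high = effective_scan_rate(**spec)
        slowest = seconds_to_scan(one_terabyte_gb, low)
        print(f"{name}: {low}-{high} GB/sec effective, "
              f"up to {slowest:.1f} s to scan 1 TB")
```

Real workloads will vary with data distribution, query shape, and how well the data compresses, so treat these numbers as a way to read the figure rather than as a benchmark.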

When you assign a dollar amount to the scan rate, things get even more interesting. If you’re still giving Oracle the benefit of the doubt with its Flash cache, consider this: it costs more than five times as much to scan the same gigabyte of data on the Exadata X3 as on the PureData N2001.

A Smarter Approach to Tackling Big Data – Figure 3

Of course, PureData System for Analytics is also an important component of the IBM big data platform. This platform offers a comprehensive solution architecture with deep integration to help tackle the full spectrum of big data needs. It’s the combination of these solutions that gives it such power: the ability to analyze data in motion and at rest, to put business intelligence into anyone’s reach, and to explore, discover, govern, and manage the data. Now that’s the smarter way to tackle big data!

What do you think? Let me know in the comments!

* Baseline scan rate based on documented out-of-the-box performance rates.
