Embrace the Change: Part 2

How IBM Netezza technology removes the intimidation of artificial constraints

Senior Principal Consultant, Brightlight Business Analytics, a division of Sirius Computer Solutions

Part 1 of this two-part series discussed the stresses on infrastructure, people, policies, processes, and protocols when the IBM® PureData™ for Analytics powered by IBM Netezza® technology first lands on the raised floor. We may know how to play the game. We may be game masters, but the game is before us with the game board and pieces already in place. We are constrained to play the game within the confines of these parameters. But confinement and constraint are not wanted. A technology that requires shoehorning the business model into it is not a leading-edge approach. Not only does it limit our creative capacity, it essentially forces us to solve problems like our competitors because they’re using the same technologies too. Instead, the capability to rapidly adapt to the storm-tossed business world is the way to go.

Data scientists as hired guns

In movies depicting the Old West, the townsfolk often encounter a problem with a rogue gang or outlaws and call for a gunslinger to clean things up. The gunslinger generally shows up at dawn, long overcoat flowing in the breeze, with much better background music than the antagonists get. The gunslinger has his own way of solving problems and no constraints when it comes to taking law-enforcement action that the townspeople will not or cannot take. The outlaws are accustomed to terrorizing law-abiding citizens. The gunslinger is law abiding but has the experience to know where the boundaries are drawn. As the showdown nears, the outlaws are about to find out that laws are in place for everyone’s protection, and that operating outside of the law is not freedom; it is a liability.

Bringing this portrayal into today’s technology, the gunslinger is the data scientist, and the laws of mathematics can bring those outlaws to justice. The outlaws are using the system in their favor only, much as an organizational vice president might manipulate reports to make the division look good to management. Now, a data scientist armed with a slide rule and an abacus is not especially intimidating. But give the data scientist a massively parallel machine that is purpose-built for analytics, and this configuration may as well be a double-barreled plasma gun with heat-seeking and target-homing options. The outlaws are toast.

Consider the following artificial constraints that are no longer daunting for the gallant data scientist hero:

Indexes

With practically every push of functionality in a traditional technology, index maintenance is a bane. Queries can be driven efficiently only by the indexed columns, and index upkeep lengthens loading and maintenance windows because the overhead rises as the tables grow. In the Netezza appliance, no indexes exist and all columns are fair game for analytics.
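To make the traditional side of that contrast concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for an index-based database. It is not Netezza code, and the sales table, its columns, and the query_plan helper are all invented for illustration. It simply asks the query planner how it would answer a filter on an indexed column versus a filter on a column nobody thought to index.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of each step SQLite would take."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the readable detail


# Hypothetical table and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "widget", 10.0), ("west", "gadget", 20.0), ("east", "gadget", 5.0)],
)

# Filtering on the indexed column lets the planner seek through the index.
indexed_plan = query_plan(
    conn, "SELECT SUM(amount) FROM sales WHERE region = ?", ("east",)
)

# Filtering on any other column falls back to reading the whole table.
unindexed_plan = query_plan(
    conn, "SELECT SUM(amount) FROM sales WHERE product = ?", ("gadget",)
)

print(indexed_plan)
print(unindexed_plan)
```

In this stand-in, the first plan reports a search that uses idx_sales_region, while the second reports a scan of the entire table. Every additional column that analysts want to filter on means another index to build and to keep current during loads, which is the overhead described above.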

Scalability

Absolutely nothing beats the raw scalability of a Netezza machine. It is well suited for supporting split-second turnaround against tables with a billion rows. For example, we have a client organization with more than 100 billion rows in a single table, and none of the queries exceed five seconds in duration—over 90 percent of them are less than two seconds. From whence cometh this power? It comes from a purpose-built, massively parallel machine that can inhale and exhale Libraries of Congress–worth of data at a time.

Administration

The ease of administration of a Netezza appliance is an enormous strength. Development leads or operators can now perform many of the tasks that normally require a database administrator (DBA). The hard part is already solved under the hood, and the hood is bolted down. No peeking.

IT staff iterations

Another bane of traditional analytics is the constant calling on IT staff to support mundane tasks. These repeated iterations keep the solution in varying states of instability and stunt the scientist’s agility. With Netezza, a more decoupled architecture can be deployed deliberately, one that requires this interaction only on strategic boundaries rather than tactical ones.

Slow model deployment

Traditional environments have applications inextricably intertwined with the data model such that even minor changes arrive at a glacial pace. Netezza not only breaks the ice, it shapes the ice into sculptures and uses the scraps for cooling drinks. Put that ice to work. Data model deployment achieves the iterative agility that everybody wants but that other platforms may not deliver. Structures and architectures (for example, object-relational) that are slow or even impossible in traditional platforms are common fare in the Netezza environment.

Self-contained experience

The traditional environment requires infrastructure that includes DBAs and oversight. The data scientist needs to remain self-contained and move quickly so that the problem solving can flow. The concept of flow has been around for a while. Achieving it is hard without a self-contained experience. Netezza enables and accelerates data scientists by providing all the power and function they need under the covers.

Analytics at the speed of thought

When biologists analyze the human body and the millions of signals required to do even simple tasks, they imagine that the human thought process, if bound in a normal computer, actually exceeds the speed of light. Even if this idea is only philosophical, it explains why a machine can never be too fast, too scalable, or the turnaround on a query too quick. Data scientists need to pump questions as fast as they arise—no delays, no excuses.

Data scientists as game changers

Today’s data scientists, coupled with the power of PureData for Analytics powered by Netezza technology, are armed and ready to play the game, change the game, or change the playing field entirely. Tomorrow’s data scientists will appear against a horizon brightened by the morning sunrise and accompanied by rising background music, silhouetted alongside a PureData for Analytics machine powered by Netezza technology. But they can leave the long overcoat at home. It will be a nice, sunny day. Please share any thoughts or questions in the comments.
