Data: To have and to hold?

April 8, 2014

Or to analyze, act and then move on?

It’s time for a more open relationship. Now announcing Streams v3.2.1.

“Till death do us part” is the goal in our personal relationships, but it is not the relationship we want with our data. Why? More data means more cost and compliance pain, and more to sift through to find the real gems of insight. Once dashboard overload sets in, the data relationship is surely doomed.

Practically speaking, it’s impossible to save all data for historical or batch analysis: data volumes double every year, and many kinds of data, like machine-to-machine data, can’t easily be persisted. And sometimes even a minute is just too late! By the time you save and analyze, the business opportunity is gone forever; arbitrage opportunities, for example, are often fleeting.

How can you analyze and act at the right moment without the heavy burden of a lifelong commitment to your data (sorry, data)? InfoSphere Streams, part of IBM Watson Foundations, has been the go-to technology for analytics of data-in-motion. InfoSphere Streams analyzes data in memory and delivers actionable insight in real time.

Stream computing is a different paradigm. Traditional data analysis uses queries to pull data out of a store such as a data warehouse or database, an approach that is still valid for many requirements. Stream computing instead brings the data to the query: data is pushed, or flows, through the analytics as it arrives.
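To make the pull-versus-push contrast concrete, here is a minimal sketch in plain Python. It is not the InfoSphere Streams API or its SPL language; the function names, sample readings and threshold are all hypothetical, chosen only to illustrate the idea of keeping a small in-memory window and acting on each value as it flows past.

```python
from collections import deque

def pull_average(stored_readings, last_n):
    """Pull model: the data is stored first, then a query scans the store."""
    recent = stored_readings[-last_n:]
    return sum(recent) / len(recent)

def streaming_average(readings, window_size):
    """Push model: each reading flows through the analytic as it arrives.

    Only the last `window_size` values are held in memory; nothing is persisted.
    """
    window = deque(maxlen=window_size)  # oldest value drops out automatically
    for value in readings:
        window.append(value)
        yield sum(window) / len(window)

def alerts(averages, threshold):
    """Act in the moment: emit (position, average) whenever the average is too high."""
    for position, average in enumerate(averages):
        if average > threshold:
            yield position, average

# Hypothetical sensor readings, used only for illustration.
readings = [10, 12, 11, 30, 35, 40]

# Pull: wait until everything is saved, then query the store.
batch_result = pull_average(readings, last_n=3)

# Push: analyze and act as each value arrives, then move on.
live_alerts = list(alerts(streaming_average(iter(readings), window_size=3), threshold=20))
```

In the push version the alert can fire as soon as the rolling average crosses the threshold, without waiting for the full data set to be saved. A real stream processing platform adds distribution, fault tolerance and much richer operators on top of this basic pattern.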

So what’s new? On April 8, IBM announced InfoSphere Streams v3.2.1. One of the most exciting parts of this announcement is our new open relationship. IBM has decided to create an open source project for some Streams components to speed development of applications and harness the energies of the development community. In future releases, we expect to incorporate new functions from the projects into the InfoSphere Streams product. Check out IBM Streams on GitHub to start coding.

There you will find the internet and messaging toolkits, which help you get data from internet sources and send and receive messages via protocols such as JMS, MQ and MQTT. You will also find the new YARN resource manager support, which allows Streams to have its resources managed by open source YARN, a key component of Hadoop v2. The goal is to make it easier to share big data clusters between Hadoop and Streams, complementing our existing support for the Platform Symphony resource manager.

In addition to being more open, InfoSphere Streams v3.2.1 delivers support for the latest big data platforms, including Hadoop 2.0.x and Hadoop 2.2.x. Enhancements to the machine data accelerator and social data accelerator help you extend and consume analytic insight. Extended R Analytics support and real-time scoring across InfoSphere Streams and InfoSphere BigInsights give you deep analytics across data-in-motion and data-at-rest.

Analysts are also buzzing about the news. Read what Enterprise Management Associates is saying. And of course, try out InfoSphere Streams for free, with no data or time limit, with our Quick Start Edition.

I will close with a thought of the day from business professor Leon C. Megginson: “It is not the strongest of the species that survives, nor the most intelligent, but the one most responsive to change.”

Be open to change. Don’t stay married to your data forever.