Rapid Analysis of Real-Time Streaming Data

Efficiently handle the demands of streaming data from disparate sources with InfoSphere Streams Version 4.0

Manager of Portfolio Strategy, IBM

The refrain “I want it all, and I want it now,” from the song “I Want It All” by the British rock band Queen, is very likely a theme in many organizations. Many line-of-business executives now demand more and faster insight than ever before. However, capturing data from today’s ubiquitous data sources is technically challenging, especially because many organizations still use spreadsheet applications as a key component of business intelligence (BI). The spreadsheet is deeply entrenched in corporate culture, despite its limitations. For example, data needs to be cleaned and structured before it can be analyzed, and it needs to fit neatly into rows and columns to be accessible to business analysts. The time to rethink this strategy has arrived.

Going beyond analysis of data at rest

Data from emerging sources such as sensors or social media can’t be transformed fast enough because of tremendous complexity and speed, yet this data is critical for analysis. Current developments in the business world require more than data-at-rest analysis (see Figure 1).

Figure 1. Current developments and their business impact on enterprises and organizations

Current development: Data is moving from batch to real-time analytics.
Business impact: Rapid decisions are required to keep pace with competitors. “Forrester survey data revealed a 66 percent increase in firms’ use of streaming analytics in the past two years.”1

Current development: Organizations are challenged to keep up with fast data.
Business impact: The value of data decreases over time.

Current development: Opportunities are missed and security and fraud risks exist despite analytics.
Business impact: Organizations can waste more than a million US dollars and thousands of hours annually on false positives.

Current development: Data volume from mobile devices, sensors, and social media is increasing, but the ability to make sense of it is declining.
Business impact: As incoming data escalates, organizations can make sense of only diminishing proportions of their data, anywhere from 7 percent down to 1 percent.2

Current development: Machine data is emerging.
Business impact: Some analysts claim that a significant portion of data will be machine generated within the next five years.

Organizations looking to tackle these challenges and drive change can deploy IBM® InfoSphere® Streams Version 4.0 streaming data analytics, which makes streaming data readily available directly in spreadsheets. Business users can get up and running with streaming data in spreadsheets in minutes, and drag-and-drop capabilities allow them to place information quickly and easily onto a worksheet. The full power of the spreadsheet is available, including charts, formulas, and the ability to cut and paste data, and spreadsheets can be easily saved and shared with others.

InfoSphere Streams can filter streaming data at high speeds and deliver relevant data to analysts in a worksheet or any other BI platform. As a result, rapid prototyping of real-time applications is possible, along with increased agility for fast, successful outcomes. For example, business analysts can quickly search streaming data for key performance indicators (KPIs). These indicators can include the number of web page visits, patterns detected for market basket analyses, or suspicious activities such as multiple credit purchases on the same account in different time zones.

InfoSphere Streams also advances processing of streaming data. Organizations can bring data from streaming data sources into mission-critical, front-end, revenue-generating applications such as customer care, operations analysis, risk management, and security. Resiliency of streaming applications is now a priority for many organizations even as the complexity of new and fast-moving data sources increases. To meet service levels, deliver enhanced customer care, and take optimal action, organizations need to help ensure that all streaming data is processed. InfoSphere Streams Version 4.0 is designed with simple operators and annotations to help confirm that no data is lost during processing.
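
To make the suspicious-activity indicator mentioned above more concrete, the following is a minimal sketch in plain Python of how a stream of purchases might be filtered for multiple credit purchases on the same account in different time zones. It is not InfoSphere Streams code and does not use any Streams API. The Purchase record, its field names, the suspicious_purchases function, and the one-hour window are all assumptions made for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Purchase:
    """Hypothetical purchase event; field names are illustrative only."""
    account: str       # account identifier
    timestamp: float   # seconds since some epoch
    utc_offset: int    # time zone of the point of sale, in hours from UTC
    amount: float      # purchase amount


def suspicious_purchases(
    stream: Iterable[Purchase], window_seconds: float = 3600.0
) -> Iterator[Tuple[Purchase, Purchase]]:
    """Yield (earlier, later) pairs made on the same account in different
    time zones within window_seconds of each other.

    Assumes the stream is ordered by timestamp.
    """
    recent: Dict[str, Deque[Purchase]] = defaultdict(deque)
    for purchase in stream:
        history = recent[purchase.account]
        # Discard purchases that have fallen out of the time window.
        while history and purchase.timestamp - history[0].timestamp > window_seconds:
            history.popleft()
        # Flag the first earlier purchase that came from another time zone.
        for earlier in history:
            if earlier.utc_offset != purchase.utc_offset:
                yield earlier, purchase
                break
        history.append(purchase)
```

Because the function is a generator, it handles one event at a time and keeps only the recent purchases for each account in memory, which is the general shape of a windowed filter over streaming data. A production system would also need to handle out-of-order events and remove state for idle accounts.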
The benefits of InfoSphere Streams Version 4.0 can extend beyond line-of-business users to IT developers and administrators. It empowers IT professionals and others with a wide range of technical skill levels to gain deeper insight into the performance of operations and applications. In today’s engaged world, one second can mean the difference between success and failure, and for some organizations a five-minute delay could mean the business goes elsewhere. InfoSphere Streams Version 4.0 offers an advanced administration console, a Java Management Extensions (JMX) management and monitoring application programming interface (API), easy security implementation, and adoption of Apache ZooKeeper (see Figure 2).

Figure 2. Charting and viewing streaming data in an interface with the capability to customize colors

Moving toward instantaneous analysis

Organizations challenged by increasing demands for more data, more often, should consider their needs for advancing analysis of streaming, real-time data. Otherwise, opportunities may be missed. InfoSphere Streams Version 4.0 helps organizations address emerging requirements to analyze fast-moving data.3 Please share any thoughts or questions in the comments.

1 “The Forrester Wave™: Big Data Streaming Analytics Platforms, Q3 2014,” by Mike Gualtieri and Rowan Curran, Forrester Research, July 2014. Based on a survey of 746 North American and European technical decision makers. Source: Forrester’s Business Technographics Global Data and Analytics Survey, 2014.

2 “Enterprise Amnesia versus Enterprise Intelligence,” presentation by Jeff Jonas, IBM Fellow, IBM® Redbooks® video, January 2013.

3 Stream Computing website at