IBM Streams: A 10-year anniversary, and what's next

Infuse AI into your data with Streams v5.0 for IBM Cloud Pak for Data

InfoSphere Streams Product Manager

IBM Streams was first available on May 15, 2009, exactly 10 years ago today. Happy anniversary!

From System S to Streams

IBM Streams evolved from a five-year collaboration between IBM Research and the U.S. government. The original goals were to:

  • Create a system to ingest unprecedented volumes of data
  • Analyze data arriving at extremely high velocity
  • Handle a variety of data, both structured and unstructured.

You can glean some Streams history from the IBM Research pages for the System S project. The “S” was short for Streams, to play off the earlier success of Project R from IBM Research – the “R” was for “relational.” The publications tab for System S includes more than 120 technical white papers dating back to 2004. The first one published was “Interval query indexing for efficient stream processing.”

A few early adopters like the University of Ontario Institute of Technology, University of Uppsala and KTH Royal Institute of Technology in Stockholm helped shape the runtime and language. This short video describes their use cases, which include neonatal intensive care unit monitoring, space weather prediction and traffic monitoring. Since then, businesses have used Streams to unlock value throughout the enterprise. One transportation company was documented as achieving a 150 percent ROI using Streams.

Just before the seventh release of System S in 2008, IBM decided to turn this IBM Research project into an offering. System S version 3.2 included a runtime and a programming model that contained 10 operators and the ability to extend with custom C/C++ or Java programs. Advanced capabilities included back pressure and an optimizing compiler to spread applications across a clustered runtime.

Soon after Streams v1.0 became available, the product’s focus was condensed to reflect volume, velocity and variety.

The first few releases continued to use the early version of Stream Processing Language (SPL) called the Stream Processing Application Declarative Engine. With Version 2.0, a major effort was made to simplify and standardize SPL so that all operators would behave consistently, making the language easier to learn and develop with.

Since the last Research System S release 10 years ago, 15 major releases have added new functionality, described in the timeline above. We put a lot of emphasis on developer tools and added a wealth of analytics capabilities.

A few highlights of functions added to Streams over the years:

  • Dozens of analytic capabilities like native machine learning, model scoring, time series analysis, rules, forecasting and geospatial data analysis
  • Growth from the original 10 operators to more than 200 operators
  • Visualization of streaming data
  • Visual drag-and-drop development in 2012
  • Java development in 2015 and Python development in 2016
  • Apache Beam development in 2018
  • At-least-once and exactly-once data processing

Thanks to Streams, we made many contributions to the open source community. Scores of additional operators are available on GitHub, including:

  • Connectors for NoSQL data stores like MongoDB and Redis
  • OpenCV toolkit and operators for video and image analytics
  • Plugins for VS Code and Atom for creating SPL programs in popular editors

Streams v5.0 brings real-time analytics to the private cloud

The latest release, Streams v5.0 for IBM Cloud Pak for Data (ICP for Data, formerly IBM Cloud Private for Data), provides a real-time engine within our data platform. The platform simplifies bringing artificial intelligence (AI) into your enterprise processes. It helps you collect, organize and analyze data, and infuse AI into your business. Streams is ideally suited for taking your AI models and infusing them throughout your company. Watch this webinar to learn more about Streams on the IBM Cloud Pak for Data platform.

One customer is running more than 1,700 AI models in Streams and achieving the following:

  • Revenues up 50 percent from improved click-stream advertising
  • Enhanced chat bot conversations
  • Prediction of which customers will call its call centers

By anticipating callers and sending recommendations to solve problems on lower-cost channels, they expect to have more than one million calls handled this year through lower-cost channels.

Over the past 10 years, IBM Streams has led the industry in the streaming analytics market with some of the most advanced use cases across many industries. As we continue to help companies infuse AI into their business processes with continuous intelligence, IBM aims to help clients drive down costs and increase revenues to improve outcomes.

If you haven’t tried out Streams yet, you’ve missed 10 years of opportunity. Isn’t it time to see what real streaming analytics is all about? Visit IBM Streams to learn more and try it out.