IBM Insight 2015: The importance of openness to the analytics platform
At Insight 2015, the IBM conference in Las Vegas this week, I’ve heard countless references to the importance of open systems, open technologies and an open IBM platform. Beneath the conference theme, “Lead in the Insight Economy,” there’s at least one important if previously unwritten sub-theme: “Lead in the Open.”
Let me give a few examples of what I mean. On Tuesday, the general session focused on data science and how it can change the way work gets done today. Beth Smith, General Manager, Analytics Platform, commented on innovations enabled by open source and their importance to organizations today.
In the Solution EXPO, signs declare that the IBM Analytics Platform is “built on Spark,” and references to Apache Spark seem to be everywhere in the EXPO zones devoted to different aspects of the platform.
Tuesday afternoon, I was able to join a keynote focused on Hadoop, Spark and distributed stream computing. In view of that focus, it was not a surprise that open source was front and center in the comments of Rod Smith, IBM Fellow and Vice President for Emerging Technologies. Adding more depth to the “built on Spark” comment, he pointed out that IBM made the decision to unify the platform on Spark to enable organizations to simplify their data architectures and spend more time on business imperatives and less time on data management.
Describing how the platform is evolving, Rod Smith mentioned the new Data Science Workbench, a notebook environment to help data scientists re-use and extend their work without moving the data itself. The intent is to reduce the burden on IT and also to satisfy the demands of data scientists by giving them self-service access to data and analytics capabilities.
For a closer look at the technology behind the remarks about open systems, the logical stop is the Demo Center, where the rubber meets the road. There I found more demonstrations of open technology than I had time to see. Here are a few examples of the technology on display.
- IBM predictive and prescriptive analytics, with advanced tools for functions such as machine data analytics, which processes large amounts of data from log files, sensors and other sources
- Spark machine learning acceleration for analytics workloads
- BigSQL, a SQL engine for Hadoop that’s written in native C/C++ code for increased performance, concurrency and security while preserving the open storage formats and metastore of Hive, making access to Hive data faster and more secure
- The combination of IBM BigSheets and System T, adding value to an open source Hadoop foundation by helping developers to build apps that can process text in multiple languages and derive insights from large amounts of text in a variety of formats
- A visual predictive analytics workbench that enables analysts to take advantage of frameworks like Hadoop and Spark in an environment where coding is supported but not required
- Project RedRock, powered by IBM Analytics running on Spark, which opens the world of big data to non-technical users by processing terabytes of data and providing an intuitive interface to uncover and act on data-driven insights discovered from Twitter
There’s more, but these examples illustrate how openness runs through the conference and the IBM Analytics Platform. The opportunities for accelerated innovation, community-based sharing and technology advances in open source are on full display. Organizations seeking to lead in the insight economy would do well to prepare themselves by tapping into these open source approaches.