Putting data to work at Strata + Hadoop World 2016
Cognitive business took a bold leap forward in New York City the week of 26 September 2016. At two events on Manhattan’s west side, IBM led customers, partners and industry at large in an exploration of how to put machine learning, artificial intelligence (AI), and big data analytics to work.
At the extremely well-attended IBM DataFirst Launch Event at Hudson Mercantile, the chief news was the announcement of Project DataWorks. This new, cloud-based offering provides a self-service environment for teams of data scientists, data engineers and other professionals to collaboratively develop, iterate and deploy sophisticated AI, cognitive computing, machine learning and other advanced analytics. Check out what Dinesh Nirmal, vice president, IBM Analytics, had to say about DataWorks in action during the Strata + Hadoop World 2016 conference.
And at nearby Strata + Hadoop World 2016, IBM further amplified, deepened and demonstrated the power of DataWorks, the new Data Science Experience and other innovative solutions to drive the productivity of data science teams working on complex initiatives.
At the Strata conference, IBM also announced the release of IBM Big SQL on Hortonworks Data Platform (HDP), which you can read more about in Andrea Braida’s detailed blog. Braida explains how this new SQL-on-Apache Hadoop offering adds value to IBM BigInsights and carries forward our commitment to the Open Data Platform initiative (ODPi). In addition, check out the Cube livestream playback of remarks on Big SQL on HDP by Berni Schiefer, a fellow in the IBM Spark Technology Center.
Rob Thomas, IBM Analytics vice president for product development, delivered a keynote on how successful modern businesses think data first. Thomas gave a quick demo of how the new IBM Project DataWorks and Data Science Experience solve real problems with real data. Also see Thomas’s related remarks on the Cube in conjunction with the DataFirst launch event.
In a breakout session, Raj Krishnamurthy presented an in-depth dissection of techniques for tuning Apache Spark machine-learning workloads. He discussed how Spark’s efficiency and speed help reduce the cost of running existing clusters. Krishnamurthy illustrated how Spark’s performance advantages can allow it to complete processing in significantly shorter batch windows with higher performance per dollar. He also walked through an alternating least squares-based matrix factorization workload to show how tuning can improve runtimes.
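For readers unfamiliar with alternating least squares (ALS), the idea can be sketched in a few lines of plain Python. This is only an illustrative, rank-1 toy and not the workload Krishnamurthy presented or Spark's implementation; the function names, the sample ratings and the regularization value are all assumptions made for the example. Each pass fixes one set of factors and solves the other in closed form, so the regularized squared error never increases from one pass to the next.

```python
def regularized_loss(ratings, x, y, reg):
    """Squared error on observed ratings plus an L2 penalty on all factors."""
    err = sum((r - x[u] * y[i]) ** 2 for (u, i), r in ratings.items())
    penalty = reg * (sum(v * v for v in x) + sum(v * v for v in y))
    return err + penalty


def als_rank1(ratings, n_users, n_items, reg=0.1, iters=10):
    """Rank-1 ALS on a sparse dict {(user, item): rating} (hypothetical helper)."""
    x = [1.0] * n_users   # one latent factor per user
    y = [1.0] * n_items   # one latent factor per item
    history = []
    for _ in range(iters):
        # Hold item factors fixed; each user factor has a closed-form solution.
        for u in range(n_users):
            seen = [(i, r) for (uu, i), r in ratings.items() if uu == u]
            num = sum(r * y[i] for i, r in seen)
            den = sum(y[i] ** 2 for i, _ in seen) + reg
            x[u] = num / den
        # Hold user factors fixed; solve each item factor the same way.
        for i in range(n_items):
            seen = [(u, r) for (u, ii), r in ratings.items() if ii == i]
            num = sum(r * x[u] for u, r in seen)
            den = sum(x[u] ** 2 for u, _ in seen) + reg
            y[i] = num / den
        history.append(regularized_loss(ratings, x, y, reg))
    return x, y, history


# Made-up sample data: three users, three items, six observed ratings.
sample_ratings = {
    (0, 0): 5.0, (0, 1): 3.0,
    (1, 0): 4.0, (1, 2): 1.0,
    (2, 1): 2.0, (2, 2): 1.0,
}
user_factors, item_factors, loss_history = als_rank1(sample_ratings, 3, 3)
```

Because the user updates are independent of one another within a pass, as are the item updates, this style of algorithm parallelizes well across a cluster, which is one reason it is a common tuning target.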
In another breakout session, Schiefer discussed ODPi as a foundation for cross-distribution Hadoop interoperability. He described how, with so much variance across Hadoop distributions, ODPi was established to create standards for both Hadoop components and testing applications on those components. He explored how application developers and companies considering Hadoop can benefit from ODPi.
Holden Karau, an IBM software development engineer, and Seth Hendrickson, an IBM data scientist, presented the basics of Spark Structured Streaming for machine learning. They demonstrated how to do streaming machine learning using Spark’s new Structured Streaming and walked their audience through the process of creating their own streaming models. They also covered how to use structured machine learning algorithms, Spark’s Structured Streaming application programming interface (API) and how machine learning works in Spark. Also, check out Karau’s recent Spark Technology Center blog on this topic.
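The general idea behind a streaming model, updating it incrementally as each micro-batch of data arrives rather than retraining from scratch, can be sketched in plain Python. This does not use Spark or its Structured Streaming API and is not the presenters' code; the `StreamingSlope` class and the sample batches are hypothetical, chosen only to show a model whose state is a small set of running sums.

```python
class StreamingSlope:
    """Least-squares slope through the origin, updated one micro-batch at a time.

    Hypothetical illustration: the model state is two running sums, so each
    new batch can be folded in without revisiting earlier data.
    """

    def __init__(self):
        self.sum_xy = 0.0
        self.sum_xx = 0.0

    def update(self, batch):
        """Fold a micro-batch of (x, y) pairs into the running sums."""
        for x, y in batch:
            self.sum_xy += x * y
            self.sum_xx += x * x
        return self

    def predict(self, x):
        """Predict y for x using the slope fitted so far (0.0 before any data)."""
        slope = self.sum_xy / self.sum_xx if self.sum_xx else 0.0
        return slope * x


# Made-up micro-batches following y = 2x, arriving one after another.
micro_batches = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(4.0, 8.0), (5.0, 10.0)],
]

model = StreamingSlope()
for batch in micro_batches:
    model.update(batch)

prediction = model.predict(10.0)
```

Keeping the state as additive sums means batches can be processed in any order, or on different workers and then combined, which is the property that makes this kind of model a natural fit for a streaming engine.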
The IBM booth was also buzzing with intelligent activity at Strata + Hadoop World. For example, Marvin the Robot drove the cognitive Rock, Paper, Scissors Grand Challenge by exercising his data-driven algorithmic intelligence to the delight of conference goers. Learn more about Marvin, the challenge, and the enabling technology.
For more information and perspective on these announcements, see these digital resources:
- IBM Unveils Industry’s First Platform to Integrate All Data Types for AI-Powered Decision Making, IBM news release, September 2016.
- IBM Project DataWorks with Watson
- IBM DSX
- DataFirst Method
- IBM Watson DataWorks Project Partners
- IBM Big SQL
- Simplifying the deployment of data-driven business innovations, IBM Big Data & Analytics Hub, September 2016.
- Chief Takeaways from the IBM DataFirst Launch Event, IBM Big Data & Analytics Hub, September 2016.
- Announcing IBM Big SQL on Hortonworks Data Platform, IBM Big Data & Analytics Hub, September 2016.