Add Apache Spark 2.0 and SystemML to your IBM BigInsights tool belt

Senior Offering Manager, IBM BigInsights, IBM

We are excited to announce the availability of Apache Spark 2.0 and Apache SystemML in IBM BigInsights for Apache Hadoop on Bluemix, accessible in the Basic plan under BigInsights for Apache Hadoop. The Basic plan, launched in August 2016, lets users spin up Apache Hadoop and Apache Spark clusters within a few minutes. Moreover, integration with the Object Storage service lets data persist outside the cluster, independent of HDFS, so clusters can be deleted without having to move the data first or risk losing it.

The latest update provides a powerful way to develop machine learning models using Apache SystemML and deploy them under the Apache Spark 2.0 runtime, and it also offers the ability to choose an IBM Open Platform (IOP) version while initiating a cluster.

  • Apache Spark 2.0 introduces significant enhancements to API usability, R UDF support, Structured Streaming and SQL 2003 support, along with overall performance improvements. If you’re interested, be sure to learn more about the new content included in the 2.0 release.
     
  • Apache SystemML is a flexible, scalable machine learning system that lets you customize algorithms using R-like and Python-like languages. It also supports multiple execution modes, including Spark MLContext and Spark Batch, and it automatically adjusts the workload based on data and cluster characteristics to promote efficiency and scalability. To find out more, discover how a group of IBM researchers put Apache SystemML to work on compressed linear algebra, and visit the IBM Spark Technology Center to explore Spark 2.0, machine learning, performance and Structured Streaming.

The Basic plan, which is based on IOP—an ODPi-compliant Hadoop and Spark distribution—allows users to choose which version of IOP to run while initiating clusters. This latest update includes the IOP v4.3 Tech Preview along with IOP v4.2, which is already available in the current beta. Choose IOP v4.2 to access a stable set of Hadoop and Spark components while using Spark 1.6.1, or choose IOP v4.3 for access to the latest and greatest features and the ability to extend your applications using the power of Spark 2.0 and Apache SystemML.

Go ahead and try out BigInsights for Apache Hadoop on Bluemix. We look forward to your comments and feedback on Stack Overflow and in the Bluemix support channel.