Announcing IBM Big SQL on Hortonworks Data Platform

Portfolio Marketing Manager, IBM

IBM is announcing the general availability of IBM Big SQL on Hortonworks Data Platform (HDP). IBM Big SQL is a powerful and sophisticated SQL-on-Apache Hadoop engine, and extending its capability to Hortonworks gives organizations a choice of two platforms: IBM Open Platform (IOP) or HDP. This is welcome news for many Hortonworks customers; organizations that have already deployed Hortonworks clusters have frequently requested Big SQL, a feature of the IBM BigInsights suite of capabilities.

The Big SQL announcement is aligned with IBM’s work with the Open Data Platform initiative (ODPi). Both Hortonworks and IBM are part of that consortium, and IOP 4.2 and Hortonworks 2.4 are both ODPi-compliant releases. Many of the interfaces that HDP and IOP use are the same, which makes both platforms consistent for core components such as the Hadoop Distributed File System (HDFS), YARN, and MapReduce—helping reduce the cost to support both IOP and HDP with a single source code base.

Building on decades of enterprise expertise

IBM invented the SQL language and the concept of a query optimizer back in the late 1970s. Today, how and where data is stored has certainly changed, but the expertise to optimize queries derived from decades of advanced research remains a core IBM competency. A critical capability for big data is the ability to support complex analytics with SQL, and many organizations demand this capability for Hortonworks Hadoop clusters.

Hortonworks customers can now keep their existing HDP clusters, which might hold hundreds of terabytes or even petabytes of data, and easily overlay a powerful SQL-on-Hadoop engine. Organizations using Apache Hive tables today can retain those tables as they are, for example in Optimized Row Columnar (ORC) format, because the Big SQL engine executes over existing Hive tables and integrates with the Hive metastore.

Supporting open formats and more

I recently sat down with the IBM open source leadership team, who shared that Dr. Phil Shelley, president of DataMetica Solutions, is a huge advocate of Big SQL: “Big SQL is the big data industry’s best kept secret, and I’m excited that now it’s available for Hortonworks.” Dr. Shelley, a former CTO at Sears, has extensive experience driving modernization, innovation and initiatives in cloud computing and the big data space, and years of experience delivering large-scale custom solutions for analytics. In addition to that endorsement, here’s a summary of other exciting new benefits:

  • Big SQL demonstrates the IBM endorsement of open formats such as Avro, JavaScript Object Notation (JSON), ORC, Parquet and many others that are open to all vendors, competitors and the like.
  • Big SQL continues to demonstrate more functionality than other SQL on Hadoop engines, including additional functions, performance and enterprise features such as security and workload management.
  • General improvements in version 4.2 include enhanced out-of-box performance with less tuning, such as auto-analysis; improved memory management and operational stability; high-performance transactional support, such as exploiting HBase; and Apache Spark integration, which is in technology preview and lets Spark executors communicate directly with Big SQL worker nodes in parallel.
  • Big SQL now understands SQL dialects from other offerings such as IBM DB2 database, IBM Netezza data warehouse appliances and Oracle database, making it a platform well suited for fast, easy relational database management system (RDBMS) offload and consolidation. Data can be offloaded from existing enterprise data warehouses or data marts to free up capacity while preserving most of the familiar SQL from those platforms.
  • Big SQL allows organizations to access data across Hadoop and relational databases, whether they are on the cloud, on premises, or both, using a single database connection.
  • Big SQL is the only SQL engine for Hadoop that exploits Apache Hive, Apache HBase, and Spark concurrently for best-in-class analytics capabilities. 

Next week, I’ll post a Q&A interview with one of our Big SQL experts, so please stay tuned. In the interim, for more information about IBM Big SQL on Hortonworks, please visit our Big SQL solution page.

IBM Big SQL is part of IBM BigInsights, and is available on premises or in cloud-based environments with enterprise-class support. IBM BigInsights is also integrated with a broad and open ecosystem of data and analytics tools.

Want to learn more about Hadoop? Get started with a complimentary version of the IBM Open Platform (IOP) with Apache Hadoop and Spark.