How the Open Data Platform is driving Hadoop's maturation

Director of Marketing for IBM Big Data & Analytics, IBM

Maturity is something that online platforms achieve after fits and starts. Sometimes the maturation process is slow to play out, and sometimes it is lightning-fast.

The web long ago matured into the global phenomenon that we’ve started to refer to as the cloud, building on such earlier development phases as Web 2.0 and the like. The web’s immature early days now seem quaint and laughable. Do you remember when the primary way you could access the true richness of information resources on the web was through a handful of consumer-grade “portals” such as AOL, CompuServe, Lycos and Yahoo!? It wasn’t until standards-compliant web browsers achieved mass adoption that the internet as we know it truly took off. I'm not old enough to remember the UNIX wars, but I am young enough to remember the Web’s early days with a tinge of embarrassment. At the time, I too thought all of that early stuff was the coolest.

In a different context, Hadoop, as the centerpiece of big data analytics, has matured over the past 10 years to become as fundamental to the world around us as the Web has. Hadoop today sits at the center of an analytics revolution similar to that of the internet. Hadoop’s power lies in its ability to gather all data, enable people from many different backgrounds with self-service provisioning and disseminate data-driven insights into every aspect of modern life.

Where we are now in Hadoop’s evolution is not that different from the web of the 1990s. Today, data practitioners interact with Hadoop through a limited number of interfaces to generate insights, and those insights are then deployed through a small number of purpose-built applications spanning a number of industries and horizontal use cases. Meanwhile, analytic application development has stagnated due to the complexities of the platform. The challenge of Hadoop has been daunting for some adopters, inasmuch as it is composed of hundreds of independent projects that are nearly impossible to integrate into consumable applications.

The time has come for the analytics industry to work together to harden the platform, with a shared vision and a common foundation to build on. Open standards that are common to all solution providers are a step in the right direction.

Enter the Open Data Platform (ODP) group of Hadoop contributors and benefactors. Spearheading this new industry initiative are a number of analytics solution providers and other leading data-driven businesses who have a common interest in maturing the Hadoop ecosystem toward open, vendor-agnostic interoperability frameworks and interfaces. IBM is a principal member of ODP, and we’ve placed the highest strategic importance on working within the initiative to develop an open framework for vendor-agnostic Hadoop interoperability.

Since the ODP launched earlier this year, the group has already defined the core components for not just one Hadoop distribution, but many, ensuring increased adoption and reducing the risk of being locked in to any single vendor. IBM, Hortonworks, Pivotal and the ODP community are now developing on Apache Hadoop 2.6. Additionally, the ODP group is working to ensure that the installation and management of disparate Hadoop components are seamless, as each ODP solution includes an easy-to-use and free application manager, the open-source Apache Ambari.

With such a diverse set of members, you can be sure there have been no modifications or tweaks to the codebase, which protects your Hadoop investment for years to come. Hadoop was started through community-wide collaboration, and staying true to this spirit means the continuous inclusion of new members in the ODP, not only those who were first. Membership in the ODP requires dues, just as membership in any non-profit organization would. In fact, you don't have to look further than the Apache Software Foundation to see that this is the case with its tiered membership as well.

Those who have opposed industry standards have found themselves quickly struggling to stay relevant as the rest of the technology community moves forward. If it were left to a handful of vendors to control the fate of the internet, we undoubtedly would not have seen the emergence of the social, search, ecommerce and rich media we all benefit from today. It was the establishment of standards that enabled the next generation of the internet by freeing innovators from wasting time on configuration.

We encourage all interested parties to join the ODP. Visit ODP’s site to learn more about the initiative, provide your contributions and engage with us further.

For more information, visit the Hadoop for the Enterprise page to learn about the power of openness (an intrinsic feature of IBM InfoSphere BigInsights v4 with Apache Hadoop) and read the latest release.

Try IBM Open Platform with Apache Hadoop