Hadoop’s commercial maturation took a big leap forward with the recent establishment of the Open Data Platform (ODP) group, which has created a common interoperability framework. ODP provides users and ISVs with assurances that there is a tested Hadoop core, allowing them to focus on building value
Day two at Hadoop Summit went well beyond the opening day theme of Hadoop’s transformative power for enterprises. The many competing Hadoop ecosystem subprojects in play may indicate just how ambiguously Hadoop’s enterprise market boundaries overlap with adjacent segments.
It’s clear that Hadoop is nearing maturity, but if this year’s summit is any indication, this segment remains vibrant and innovative. Indeed, many of the sessions addressed significant gaps in our own knowledge of this fast-moving space.
Apache Spark is gaining considerable notice in the data science community, and the technology was showcased in the recent debut of a Spark hackathon series. Take a look at a web server enabling Spark cloud instances to serve as web endpoints and an application to predict stock movement that were
Apache Spark is arguably surpassing Apache Hadoop as the preferred big data analytics development platform. Yet the specialized algorithm and model libraries expected to emerge from the Spark community raise the specter of platform bloat that may put Spark at risk of becoming too bloated
Apache Spark is unfamiliar to many data analytics professionals. A recent post provides high-level guidance on how they might begin to identify the applications for which Spark is well suited. This post expands on that discussion to offer further details for triggering the creative imaginations of