On Tuesday, I plunged right back into Spark Summit, which, if anything, was buzzing more vigorously with interesting content than it had been the day before. Not surprisingly, IBM’s Spark announcements were the talk of the show.
A growing body of fresh thinking is coming down the pike. Much of it will come from the droves of IBMer data scientists who participated in the recent and wildly successful internal Hack Spark Challenge, as well as ongoing IBM-sponsored hackathons, meetups and developer days focusing on Spark.
Combining data, design and speed, IBM and Spark are creating a new blueprint of innovation together. This is the start of something big. IBM and Spark - Power of data. Simplicity of design. Speed of innovation.
Apache Spark is at heart an open-source community, but it is going well beyond that identity to also develop into a substantial sector of the analytics market. However, Spark will not be able to achieve its full potential if a robust industry ecosystem does not develop around it.
Something palpable was in the air at Hadoop Summit 2015 that confirmed the next big thing in big data analytics is on the horizon. As this year’s Summit drew to a close, the community enthusiastically looked forward to the emergence of Spark.
Scaling big data analytics applications is expected to become impractical given the rate at which the volume, variety and velocity of data are increasing. Continued advances in machine learning are critical to enable data scientists to automatically generate machine learning models for rapidly
Hadoop’s commercial maturation took a big leap forward with the recent establishment of the Open Data Platform (ODP) group, which has created a common interoperability framework. ODP provides users and ISVs with assurances that there is a tested Hadoop core, allowing them to focus on building value
Day two at Hadoop Summit went well beyond the opening day theme of Hadoop’s transformative power for enterprises. The many competing Hadoop ecosystem subprojects in play may be an indication of just how ambiguously Hadoop’s enterprise market boundaries overlap with adjacent segments.
It’s clear that Hadoop is nearing maturity, but if this year’s summit is any indication, this segment remains vibrant and innovative. Indeed, many of the sessions addressed significant gaps in our own knowledge of this fast-moving space.
Apache Spark is gaining considerable notice in the data science community, and the technology was showcased in the recent debut of a Spark hackathon series. Take a look at a web server enabling Spark cloud instances to serve as web end points and an application to predict stock movement that were
Apache Spark is arguably surpassing Apache Hadoop as the preferred big data analytics development platform. Yet the specialized algorithm and model libraries expected to emerge from the Spark community raise the specter of platform bloat that may put Spark at risk of becoming too bloated
Get in on the widespread excitement over Apache Spark. Check out the highlights from a recent SparkInsight CrowdChat that tackled six key questions about this next-generation, cluster-computing, runtime processing environment and development framework for in-memory processing of advanced analytics.
A growing number of big data and analytics use cases fall squarely in Apache Spark's sweet spot. Take a look at several low-latency applications in which Spark is well suited to analyzing cached, live data.