Spark in the world beyond Silicon Valley
Though incubated in Silicon Valley and environs, Apache Spark is very much a global phenomenon. Like Apache Hadoop and big data analytics in general, Spark is being adopted all around the world. And it is being embraced by the new generation of data science professionals as the power tool for analytics challenges requiring big data, machine learning, in-memory computation, stream computing and graph analysis.
How global is Spark in its adoption? The question can be answered on several levels, and IBM’s own embrace of Spark is one of them. As mentioned in my recent blog posting, Spark and the crux of differentiation, IBM is adopting Spark throughout its revenue-producing solution and service portfolio. IBM is also accelerating research and development investment in Spark at its labs and development centers throughout the world. And IBM associates, IBM Business Partners and customers are being educated on Spark through IBM’s own professional services organization, the Big Data University online resource and strategic Business Partners around the world.
Another data point on Spark adoption throughout IBM worldwide is an internal Hack Spark Challenge conducted in May–June 2015. As Rob Thomas, vice president of product development for IBM Analytics, reported in a recent blog post, the contest had over 100 submissions in just the first 10 days. Of those submissions, responses from nine participants were pulled into a Big Data & Analytics Hub listicle. Though that posting offered only a convenience sample of all participants, it is worth noting that just one of the nine data scientists in it could in any way be regarded as hailing from Silicon Valley. One was from Bangalore, India; two were from the US, one from Research Triangle Park and one from San Francisco; three were from Toronto, Canada; one was from Shanghai, China; one was from Tokyo, Japan; and one was from Dublin, Ireland. Clearly, data scientists in Asia, Europe and elsewhere are working with Spark.
And in response to demand from customers and partners worldwide, IBM is starting a monthly program of community events, dubbed Datapalooza. It is aimed at the new generation of data scientists and focused on accelerating applications on Spark as a Service. The series starts November 10–12, 2015, in San Francisco, California, and IBM will hold subsequent Datapaloozas in Tel Aviv, Israel; Berlin, Germany; Tokyo, Japan; New York, New York; London, England; Sydney, Australia; Bangalore, India; Sao Paulo, Brazil; Hong Kong; and other cities.
If you think that IBM is not a representative data point on worldwide Spark adoption, think again. Check out the latest list of organizations and projects powered by Spark from the Spark website. Many nations are amply represented among both the vendors and users who have adopted Spark into a wide range of cloud-facing, analytics-powered, big data applications.
As another data point on the worldwide interest in Spark, the Spark website’s current list of developer events focused on Spark literally circles the globe. Upcoming events will take place throughout North America and beyond:
- Barcelona, Spain
- Bangalore, India
- Berlin, Germany
- Beijing, China
- Hangzhou, China
- Hyderabad, India
- Ljubljana, Slovenia
- London, England
- Moscow, Russia
- Mumbai, India
- Shanghai, China
- Shenzhen, China
- Tokyo, Japan
- Zagreb, Croatia
Another fundamental data point, or rather a fire hose of them, comes from a quick look at the chatter on Twitter’s #apachespark hashtag. The participants in that discussion are spread all over the planet, and they’re aggressively innovating with Spark tools to solve pressing scientific, business and other challenges facing the human race.
Take the opportunity to engage with peers by registering for Spark Summit Europe, 27–29 October 2015, in Amsterdam, the Netherlands. Learn how to use Spark as a service on the IBM Bluemix cloud platform to address urgent business challenges.
In addition, deepen your data science explorations with resources on an open and unified analytics platform and on Hadoop and Spark, as well as IBM resources on Spark thought leadership, IBM Big Data & Analytics Hub thought leadership content on Spark, the IBM Spark Technology Center and an IBM Big Data University Spark Fundamentals course.
And of course, be sure to register for Datapalooza, and stay tuned to see when an event is coming to a city near you.