
Set fire to your cloud with a single Spark

Content Marketing Manager, IBM

Apache Spark is on fire.

What’s so great about Spark? If you haven’t heard, Spark is a fast, easy-to-use cluster computing system for large-scale data processing that can run 10–100 times faster than existing big data technologies on the market today. One way Spark achieves this speed is by aggressively caching data in memory across the cluster. Spark can also consolidate jobs that other technologies would need to perform one after another.
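To make those two ideas concrete, here is a toy sketch in plain Python. It is not the Spark API: the LazyDataset class, its method names and the read_source helper are all hypothetical, invented for illustration. It only mimics the general pattern described above, in which transformations are recorded first, run together in a single pass when a result is requested, and the result is kept in memory for reuse.

```python
from typing import Callable, Iterable, List, Optional


class LazyDataset:
    """Toy stand-in for a distributed dataset (hypothetical, NOT the Spark API)."""

    def __init__(self, source: Callable[[], Iterable[int]],
                 steps: Optional[list] = None) -> None:
        self._source = source        # function that (re)reads the raw data
        self._steps = steps or []    # recorded transformations, not yet run
        self._use_cache = False
        self._cached: Optional[List[int]] = None

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Only record the step; no data is touched yet.
        return LazyDataset(self._source, self._steps + [("map", fn)])

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + [("filter", pred)])

    def cache(self) -> "LazyDataset":
        # Ask for the computed result to be kept in memory after first use.
        self._use_cache = True
        return self

    def collect(self) -> List[int]:
        if self._cached is not None:
            return list(self._cached)      # served from memory, no re-read
        out: List[int] = []
        for item in self._source():        # one pass runs every recorded step
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        if self._use_cache:
            self._cached = out
        return list(out)


reads = {"count": 0}                       # counts how often the source is read


def read_source() -> Iterable[int]:
    reads["count"] += 1
    return range(10)


squares_of_evens = (
    LazyDataset(read_source)
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x * x)
    .cache()
)

first = squares_of_evens.collect()         # reads the source once
second = squares_of_evens.collect()        # answered from the in-memory copy
print(first)                               # [0, 4, 16, 36, 64]
print(reads["count"])                      # 1
```

The filter and the map run in the same loop rather than as two separate jobs, and the second collect() never goes back to the source. A real Spark cluster does this across many machines, but the payoff is the same in spirit: fewer passes over the data and fewer trips to slow storage.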

Another perk: Spark includes extension libraries for machine learning, streaming, data frames, graph analysis and SQL, all of which can be used through a common programming framework.

In short, if you’re a data scientist, data engineer or application developer interested in working with big data analytics, you can do more in less time with Spark.

That all sounds great, but it takes a fair amount of resources and investment to get a new big data project off the ground. There are hard questions to answer about infrastructure, resources and expertise, along with concerns about up-front investment and risk. Organizations can mitigate much of that risk and complexity by experimenting with new technologies as managed service offerings in the cloud.

Cloud-based delivery of Spark can address these concerns, alleviating much of the hassle associated with running a self-service deployment of the open source technology. With a managed cloud service, it’s possible to access Spark’s next-generation performance and capabilities on demand, pay only for what you use and scale as you grow.

Today’s announcement of IBM Analytics for Apache Spark addresses those requirements. A case in point is SolutionInc, an Internet gateway provider that offers on-premises and cloud-based solutions for managing high-demand public Wi-Fi in hotels, conference centers, parks and other public buildings in over 50 countries worldwide. The company knew there were insights buried in its mountains of data that could be used to benefit its clients. Working with the jStart team at IBM, SolutionInc turned on the new Spark service and was uncovering information about its busiest access points and most frequent users in just a few weeks. Learn more about SolutionInc’s implementation in John Feller’s article.

Using the service, you can provision a Spark instance in IBM Bluemix in minutes and jump right into an integrated notebook. Notebooks provide an interactive computational environment for performing analytic tasks on data coming from diverse sources, and combine code execution, rich text, mathematics, plots and rich media in one place. Start coding in Python, Scala or Java, import a notebook, or use one of the sample notebooks provided.

You can also easily integrate Spark applications with other cloud data services and third-party tools. That means more of your data can be stored and managed in the same place, which cuts down on the complexity and headaches that come along with managing disparate systems and tools.

Take your Spark journey to the next step. IBM invites you to a free 3-month trial of IBM Analytics for Apache Spark and IBM Cloudant. Use Spark in the cloud to conduct fast in-memory analytics on your Cloudant JSON data. Sign up today and also receive free SaaS Startup Advisory Services to help you accelerate your time to results.