Approaching the big data starting line with the Quick Start Program
It’s a new year, and you know what that means. Resolutions are set, reflection is in full swing and, if you’re like me, you’re trying to find new ways to make life a bit simpler. Thanks to the IBM Big Data Quick Start Program, you can make your big data life simpler in 2014 as well.
Tell me this: when it comes to big data, what words pop into your head? Maybe Hadoop? Analytics? Complexity? Volume? How about the words easy and free? Probably not so much, right? Yet easy and free are what the Quick Start Program is all about.
With Quick Start, you can download key big data technologies for free. They’re designed to make it easy for you to get started and explore different use cases. The simplicity of the Quick Start program is two-pronged:
- First, we design the products with the end user in mind. We know that not everybody can be a Hadoop expert, for example. Not every organization will have people who specialize in real-time analytic processing. But with features like drag-and-drop graphical editors, analytics toolkits and visualization, much of the hard work in building big data applications is already done for you.
- Second, we pair the Quick Start offerings with free education. You can join the more than 130,000 registered members on Big Data University to take free courses. You can follow tutorials that walk you through how to use these products, and you can watch videos that demonstrate how to get started.
What products are available through the Quick Start Program?
InfoSphere BigInsights is IBM’s Hadoop-based offering, but what makes it different is that it’s designed for the enterprise. This means we start with and preserve Apache open source Hadoop, and then extend its value by adding enterprise features on top of it. These features include:
- Text analytics: sophisticated text analytics unique to BigInsights with a vast library of extractors enabling actionable insights from large amounts of native textual data
- BigSheets: web-based analysis and visualization tool with a familiar, spreadsheet-like interface that enables analysis of large amounts of data and helps to design and manage long-running data collection jobs
- Big SQL: native SQL query engine that enables SQL access to data stored in BigInsights, leveraging MapReduce for complex data sets and direct access for smaller queries
- Workload Optimization: adaptive MapReduce adapts to user needs and system workloads automatically to improve performance and simplify job tuning, while the workload scheduler provides optimization and control of job scheduling based on user-selected metrics
- Development Tools: familiar, Eclipse-based development environment for building and deploying analytic applications, plus a set of developer tools, extractors and editors for fast adoption and reduced coding and debugging
- Management Capabilities: auditing helps tighten security and access control, while monitoring provides the ability to control all applications from a centralized dashboard
We’ve made all of the above capabilities available in our BigInsights Quick Start offering, so when you download it, you’ll be able to get your hands on all of them.
InfoSphere Streams is IBM’s unique real-time analytic processing solution. This means you can analyze data in motion, and lots of it: up to millions of events per second!
The need for analysis of real-time data is a critical and growing requirement, and no other offering gives you capabilities this powerful. You may be hearing more about similar offerings, based on alpha code, which shows that more people are recognizing the need for real-time analysis. But Streams has been in production with customers for years. This is real real-time analytic processing. The sheer number of analytic toolkits built into the product speaks to its robustness. And with Quick Start, you can get your hands on these toolkits to start playing around with them.
InfoSphere Streams comes with:
- Comprehensive Development Tools: to make it easier to build applications
- Scale-Out Architecture: scale from a single server to a virtually unlimited number of nodes to process millions of events per second with microsecond latency
- Sophisticated Analytics: several analytic toolkits, including:
  - Geospatial. Enable location-based services with high-performance analysis and processing of geospatial data
  - Time Series. Perform a rich set of functions that include generation (synthesizing or extracting), pre-processing (preparation and conditioning), analysis (statistics, correlations, decomposition and transformation) and modeling (prediction, regression and tracking)
  - R analytics. Perform data analysis with statistical, mining and modeling capabilities
  - Complex Event Processing. Detect composite events in streams of simple events using patterns
  - SPSS. Develop and build predictive models using IBM SPSS Modeler and then run these models on Streams for real-time scoring and predictions
  - Advanced Text Analytics. Natural language processing for a variety of tasks such as sentiment analysis and understanding intent to purchase
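To make the Complex Event Processing idea a little more concrete, here is a minimal Python sketch of detecting a composite event in a stream of simple events. It is not Streams code and does not use any Streams toolkit API; the `Event` class, the `detect_suspicious_logins` function, the event names and the thresholds are all hypothetical, chosen only to illustrate the pattern "several failed logins followed by a success within a time window".

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Event:
    """A simple event (hypothetical shape, for illustration only)."""
    ts: float   # event time in seconds
    user: str
    kind: str   # "login_failed" or "login_ok"


def detect_suspicious_logins(
    events: Iterable[Event], failures: int = 3, window: float = 60.0
) -> Iterator[Tuple[str, float]]:
    """Yield (user, ts) when at least `failures` failed logins within
    `window` seconds are followed by a successful login."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    for ev in events:
        q = recent[ev.user]
        # Forget failures that have fallen out of the time window.
        while q and ev.ts - q[0] > window:
            q.popleft()
        if ev.kind == "login_failed":
            q.append(ev.ts)
        elif ev.kind == "login_ok":
            if len(q) >= failures:
                yield (ev.user, ev.ts)  # the composite event
            q.clear()


stream = [
    Event(0, "ann", "login_failed"),
    Event(10, "ann", "login_failed"),
    Event(20, "ann", "login_failed"),
    Event(25, "ann", "login_ok"),
    Event(30, "bob", "login_failed"),
    Event(31, "bob", "login_ok"),
]
alerts = list(detect_suspicious_logins(stream))
```

Here only the sequence for "ann" matches the pattern; a single failed attempt by "bob" does not. A real deployment would express the pattern with the product's own tooling rather than hand-written loops.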
Taking it to the next level: BigInsights and Streams together
BigInsights and Streams were the logical launch points for the Quick Start Program because they share a common set of analytic toolkits and accelerators. Typically, clients analyze data in motion with Streams and then store the resulting insight in BigInsights for additional analysis, such as trend correlation or historical analysis.
Together, these two solutions, both key components of the big data platform, give clients the most comprehensive big data technologies in the market. IBM is unique in having developed an enterprise-class big data platform that allows you to address the full spectrum of big data business challenges. The real benefit of the platform is leverage: the ability to start with one capability and easily add others over your big data journey.
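The pattern described in this section, analyzing data in motion and keeping only the insight for later analysis, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Streams or BigInsights code: the function names are hypothetical, the readings are made up, and a plain list stands in for the data store.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, List


def rolling_averages(readings: Iterable[float], size: int = 3) -> Iterator[float]:
    """In-motion step: emit the mean of the last `size` readings as each arrives."""
    window: Deque[float] = deque(maxlen=size)
    for value in readings:
        window.append(value)
        yield sum(window) / len(window)


def run_pipeline(
    readings: Iterable[float], threshold: float, store: List[Dict[str, float]]
) -> List[Dict[str, float]]:
    """Keep only the insights (averages above `threshold`), not the raw data.
    `store` is a plain list standing in for a persistent data store."""
    for i, avg in enumerate(rolling_averages(readings)):
        if avg > threshold:
            store.append({"index": i, "rolling_avg": avg})
    return store


# Made-up sensor readings, processed one at a time as they "arrive".
insights = run_pipeline([10, 20, 30, 40, 50], threshold=25, store=[])

# Historical step: later analysis runs over the stored insights only.
peak = max(row["rolling_avg"] for row in insights)
```

The design point is that the raw stream is never stored; only the two readings whose rolling average crossed the threshold are kept for the later, at-rest analysis.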
How can you get started with Quick Start?
Download from these websites:
- BigInsights Quick Start: http://ibm.com/infosphere/quickstart
- Streams Quick Start: http://ibm.com/infosphere/streams-quickstart
There’s no time limit and no data limit, which means you can get educated at your own pace and start exploring which use cases make sense for you and your organization. These are non-production versions with no support option, so play with them, and when you’re ready to go into production, you can get the full versions.
- Listen: Quick Start Program Podcast
- Download: InfoSphere BigInsights for Hadoop Quick Start
- Download: InfoSphere Streams white paper
- Connect: Big Data Community
- Learn: Big Data Zone