10 can’t-miss posts for big data and analytics developers: March edition

Do you know how many olives it takes to make that tablespoon of olive oil you just doused on your lunch salad? It takes a big pile of olives. More than 50.

Now, guess how much data it takes to get an accurate picture of market trends, emerging markets or potential security breaches. It takes a big pile of data. More than you know what to do with, in fact.

That’s where the tools and technologies you read about on developerWorks come into play. To squeeze down all that data and extract meaning from it, you’ve got to build specialized skills and implement methods designed for the job.

Look over this collection of how-to materials and check out what is working for others. With the right resources, you can turn volumes of data into key market insight.

  • Big data roundup: If you haven’t already subscribed, sign up for this weekly newsletter—one of the easiest ways we know of to keep tabs on big data technology.
  1. Download InfoSphere BigInsights Quick Start Edition: Get this free, non-production version of BigInsights to build solutions that cost-effectively turn large, complex volumes of data into insight. It combines Apache Hadoop (including the MapReduce framework and the Hadoop Distributed File System) with enterprise-ready technologies and capabilities from across IBM, including Big SQL, text analytics and BigSheets.
  2. IBM Big SQL vs. HBase: Easy way to handle queries and create business intelligence reports: Learn about query handling and how to connect to Big SQL via JDBC to run business intelligence and reporting tools such as BIRT or Cognos.
  3. Process small, compressed files in Hadoop using CombineFileInputFormat: Discover how to use CombineFileInputFormat within the MapReduce framework to decouple the amount of data a Mapper consumes from the block size of the files in HDFS.
  4. IBM Big SQL vs. HBase: Create tables and load data: Learn fundamental usage of IBM's Big SQL technology for Hadoop over HBase by creating tables and examining ways to load data. Follow a basic storyline of migrating a relational table to HBase using Big SQL.
  5. Process complex text for information mining: Even the basic task of picking out specific words, phrases or ideas from raw text is challenging. Learn how AQL and InfoSphere BigInsights can help you process text into meaningful data that can be converted to usable information.
  6. Download InfoSphere Streams Quick Start Edition: Try out this free, non-production version of InfoSphere Streams, a high-performance analytics platform that allows user-developed applications to rapidly ingest, analyze and correlate information as it arrives from thousands of real-time sources. Experiment with stream computing in your own environment. Build a powerful analytics platform that can handle incredibly high data throughput, up to millions of events or messages per second.
  7. Improve your agile development lifecycle with SPSS: Build a lean analytics strategy that empowers project managers to optimize their development lifecycle by focusing attention only where needed. Doing so gives product teams the opportunity to make real-time decisions about feature implementation.
  8. IBM Business analytics proven practices: Revival procedure for predictive maintenance and quality solution: Get the IBM Predictive Maintenance and Quality environment back to the right state using the detailed steps in this article. Each section describes the process for starting each node to retain its integrity.
  9. Practical data mining of vague and uncertain data: Introducing fuzzy association rule mining: With ever larger amounts of data being generated both privately in business and publicly over the web, data mining is becoming increasingly interesting and useful. Learn how fuzzy set theory is better suited to handle some forms of uncertainty and vagueness.
  10. IBM Business analytics proven practices: Images in IBM Cognos BI reports and analyses: Get updated information, additional detail and troubleshooting techniques for issues regarding images not showing up in supported report output formats.

Find more deep-dive technical how-to content on big data and analytics tools and technologies at IBM developerWorks big data and business analytics.