Why Big Data Doesn’t Require a Big Idea

Bigger isn’t better when choosing your first big data project

Solution CTO, IBM

A few months ago, I wrote an article arguing in favor of controlled experimentation as a corporate strategy for learning about big data. This approach flies in the face of the common misconception that companies should only embrace mature technologies with clear ROI. This month, I’d like to examine another big data misconception: the myth that leveraging big data demands a big idea.

Sure, big ideas are fun. Some big ideas really do change the world, thankfully. But when you really dig into how big ideas are operationalized, it becomes clear that good old-fashioned hard work rules the day. I know this idea isn’t consistent with all the ill-informed hype, but unlike the hype, it happens to be true.

I was reminded of this recently by yet another LinkedIn exchange on a Forbes article by Bob Evans. In the article, Evans gushed about a piece that Constellation Research CEO Ray Wang published on a Harvard Business Review blog. The gist of Wang’s piece is that big data opportunities fall into only three buckets:

  1. Differentiation. Wang argues that “big data offers opportunities for many more service offerings that can reinvent the customer experience based on better relevance of the experience.”
  2. Brokering. In support of extreme personalization, Wang says “these new analysis and insight streams could be created and maintained by information brokers who could sort by age, location, interest, and other categories.”
  3. Network monetizers. This category covers finding new ways of using personalized information and delivery mechanisms. For example, “large wireless carriers can map traffic flows down to the cell tower. Using this data, carriers could work with display advertisers to optimize advertising rates for the most popular routes on football game days based on digital foot traffic.”

Wang’s three opportunities are fine, but they also feed into the hype that big ideas are the only place to start. This just isn’t true. In my experience, the pragmatic use cases are a much better place to start. I know it can be more interesting to focus on big ideas right out of the gate, but in most cases, the right opportunity is a modest and pragmatic one.

Swinging for the fences first time up is simply NOT a best practice. In fact, it goes directly against the project methodology I created based on all of IBM’s years of big data project work. Going after a big idea as your big data starting point may work for a venture-funded firm whose whole existence is based on a swing-for-the-fences new product. But for the vast majority of enterprises, it is simply bad methodology. I’d also strongly suggest that the Network Monetizer idea is about to come under serious pressure from privacy considerations (more on that later this year).

I’m not saying that Wang’s three opportunity buckets are conceptually incorrect, but he is skipping over dozens of better near-term places to start. Sometimes business users just need to be able to run their reports faster, and there is nothing wrong with that. Perhaps you can make a case for differentiation as a place to start (provided your goal is to walk, not run, by simply understanding customer behavior rather than trying to comprehensively reinvent the customer experience and/or how the company functions).

But if reinventing the whole company with your first big data project is an iffy idea, where do you start? First, brush up on Fit for Purpose architectures. Then keep these guidelines in mind:

  • Boil a bathtub, not an ocean
  • Pick a proven path
  • Make sure your project can be done offline and is non-disruptive to existing systems
  • Ensure that there is low-hanging fruit for additional insights
  • Use a data set that is already stored, but under-instrumented or overly summarized
  • Choose a project where initial findings can be arrived at in 4 weeks or less once the data is ready
  • Make sure your initial use cases are accretive to the next set of use cases
  • Leverage common technology for the next set of use cases

More on these ideas will follow in future columns. In the meantime, I’ve recorded several webcasts that cover these topics in an interactive format. So what do you think? Does this all make sense? Do you have different or better ideas to propose? Let me know in the comments.
