Experimentation as a Corporate Strategy for Big Data

Guidelines for a smarter approach to learning about new technologies

Solution CTO, IBM

InformationWeek recently ran an article entitled “Executives Push Big Data Projects, Not Sure Why.” The gist of the article is that few companies are run by executives who understand either the technicalities or the purpose of big data well enough to help their companies profit from it; as a result, there is no sound strategy, or even a legitimate business driver, behind much of their big data work. I posted the article in one of the LinkedIn groups that I run for business executives, and as I expected, it prompted some heated conversation.

The reactions of the group fell into two camps. One side believed that doing anything without a specific reason, supported by a clear and direct ROI, is a fool’s errand. The other side argued that adoption of big data technologies is inevitable, so it is important to begin learning now, because strategic decisions are better made after you have worked with the technologies in question.

I’d like to suggest that the truth lies right in the middle of these two camps. What the InformationWeek article missed is that while you don’t necessarily want your CEO deep into how the Hadoop or Streams processing models work, you do want your CEO pushing the organization toward the paradigm shifts that these technologies enable. Those paradigm shifts are inherently disruptive, so jumping in blind is not exactly a best practice.

Conventional, mature technologies should not be deployed without a clear understanding of what you want to do with them and why they are the right tools for the job (see my previous article, “Fit-for-Purpose Architectures,” for more on the reasoning behind this idea). However, if you have never worked with a particular technology, it is hard to know exactly what it is the right fit for, so experimentation should be encouraged. From a strictly pragmatic point of view, you have to sponsor experimentation; no amount of reading or meetings can adequately substitute for hands-on experience. But you have to be smart about it.
I’d strongly suggest tying your experimentation to real-life problems and putting a time limit on it. To that end, I’ve developed a methodology for use-case selection. A good initial use case:

  • Follows a proven path
  • Can be done offline, without disrupting existing systems
  • Offers intuitive low-hanging fruit for additional insights
  • Uses a data set that is already stored but under-instrumented or overly summarized
  • Yields initial findings in three weeks or less
  • Is accretive to the next set of use cases
  • Leverages technology common to the next set of use cases

Rather than try to cover all of these selection criteria, and why they matter, in one article, we’re going to take them up as an ongoing set of discussion topics over the next several months. Expect to see the notion of fit-for-purpose architectures threaded through these posts, since you can’t really separate that notion from smart use-case selection.
