Decision Confidence: Where the Predictive Chickens Come Home to Roost

Big Data Evangelist, IBM

Business is all about placing bets on the future, having confidence that the odds are in your favor. You want to be confident that the consumer demand you anticipate will in fact materialize. You require trustworthy data to support your forecast that the product you’re developing will address a profitable, untapped niche when it comes to market. And you need predictive models, grounded in many years of historical data across several business cycles, to corroborate your belief that factor costs and currency exchange rates won’t eat up all of your profit when you bring your new offering to market.

Of course, there have always been intuitive business geniuses who have succeeded on the proverbial wing and a prayer. But how confident are you that you’re one of them? You might place great stock in your own gut feel and educated opinions. In fact, you might be 100 percent sure that whatever your “little birdie” tells you is the absolute truth. But in data-driven business, your personal hunch is usually not good enough. Indeed, you probably won’t be able to rally colleagues, partners and investors around your risky dream unless you have good numbers to back it up.

In modern business, you can’t inspire confidence in your plans if you don’t justify them with high-quality data analytics. At heart, what that means is that you need to have high-quality business intelligence (BI) and a “single version of truth” data warehouse to support decision making at all levels in the organization. Jennifer McGinn’s recent IBM Big Data Hub blog on the decision-support benefits of high-quality data provides a good infographic overview of this business imperative.

Just as important, high-quality decisions also depend on having confidence in the predictive models that drive both human and automated decision-making in your business applications. This depends, in turn, on whether you have used trustworthy data and the best predictive models that have been “trained” on and “scored” against this data. In practice, decision confidence depends on your business’s success in establishing best practices in BI, data governance, master data management, statistical modeling and data science.

None of this is rocket science. In fact, these are all mature bodies of best practice that many organizations have adopted in their data analytics programs. Rather than lead you on a forced march through them all, I’d like to call out the data-science best practices that are most fundamental to decision confidence. In a recent IBM Big Data Hub blog, I discussed the key practices that your data scientists must get right in order to strengthen the predictive power of their models:

  • Select the right analytic problem
  • Select the right subject population
  • Select the right data sources
  • Select the right data samples
  • Select the right data and model versions
  • Select the right predictive variables
  • Select the right modeling approach and algorithms
  • Select the right model-validation frequency
  • Select the right model-fitness approaches
  • Select the right visualizations

In a recent IBM Smarter Analytics blog, I also highlighted the need to have your data scientists regularly verify the soundness of each other’s predictive models and the quality of the data upon which those models were built. My key recommendation was to task, wherever feasible, non-overlapping teams of data scientists with independently vetting one another’s methodologies and attempting to replicate each other’s results.

In another IBM Big Data Hub blog, I discussed the concept of “next best model,” which boosts confidence in decision-automation applications by auto-deploying the best-fit model into each prediction-driven business process. This helps to realize the goal of “next best action”: the ideal of taking the optimal action in each decision cycle in accordance with the best model, data and business rules at that point in time. One example of this, common in many companies’ multichannel marketing environments, is the ability to continuously auto-target each customer with the specific offer that he or she is most likely to accept.
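To make the offer-targeting example concrete, here is a minimal Python sketch of a “next best action” selection step. It is only an illustration under stated assumptions: the function name `next_best_offer`, the `acceptance_probability` scoring callback (standing in for whatever model is currently deployed) and the `is_eligible` business-rule callback are all hypothetical, not part of any particular product.

```python
from typing import Callable, Dict, List, Optional

Customer = Dict[str, float]

def next_best_offer(
    customer: Customer,
    offers: List[str],
    acceptance_probability: Callable[[Customer, str], float],  # hypothetical model hook
    is_eligible: Callable[[Customer, str], bool],              # hypothetical business rule
) -> Optional[str]:
    """Pick the eligible offer the current model rates most likely to be accepted."""
    # Business rules filter first; the model only ranks what is allowed.
    candidates = [offer for offer in offers if is_eligible(customer, offer)]
    if not candidates:
        return None
    return max(candidates, key=lambda offer: acceptance_probability(customer, offer))

# Toy stand-ins for a real model and real rules (illustrative values only).
toy_scores = {"free_shipping": 0.4, "premium_upgrade": 0.7, "loyalty_bonus": 0.9}

def toy_model(customer: Customer, offer: str) -> float:
    return toy_scores[offer]

def toy_rule(customer: Customer, offer: str) -> bool:
    # Assumed rule: the loyalty bonus requires at least five years of tenure.
    return offer != "loyalty_bonus" or customer["tenure_years"] >= 5

new_customer = {"tenure_years": 2}
chosen = next_best_offer(new_customer, list(toy_scores), toy_model, toy_rule)
```

In this toy run the highest-scoring offer is ruled out by the tenure rule, so the selection falls back to the best remaining candidate. Swapping in a different model only means passing a different scoring callback.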

This auto-targeting scenario is an even more powerful confidence booster if you can instrument the prediction-driven process with models that are continuously adaptive, dynamic and self-learning from fresh feeds of data (e.g., which offers customers have accepted vs. rejected). Ideally, there should always be a “champion” (i.e., best-fit) model in production, with one or more “challenger” models ready to be promoted to production if the champion’s predictive power decays (if, for example, the erstwhile champion fails to predict offer acceptances above a business-specified minimum confidence level). To make this scenario work in practice, you would need to continue to score all of these models, champion and challengers alike, against continuous feeds of fresh information from line-of-business applications, enterprise data warehouses, Hadoop clusters and other data sources.
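The champion/challenger loop can be sketched in a few lines of Python. This is a rough illustration, not a production design: it assumes each model is a plain function returning a predicted acceptance probability, it measures predictive power as simple accuracy at a 0.5 cutoff on the latest batch of outcomes, and it treats the business-specified minimum as a threshold on that accuracy. The names `accuracy`, `select_production_model` and `min_accuracy` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
# Assumed interface: a model maps customer/offer features to the
# predicted probability that the offer is accepted.
Model = Callable[[Features], float]
Outcome = Tuple[Features, bool]  # (features, offer was actually accepted)

def accuracy(model: Model, batch: List[Outcome]) -> float:
    """Share of fresh outcomes the model predicts correctly at a 0.5 cutoff."""
    if not batch:
        return 0.0
    hits = sum((model(features) >= 0.5) == accepted for features, accepted in batch)
    return hits / len(batch)

def select_production_model(
    champion: Tuple[str, Model],
    challengers: Dict[str, Model],
    batch: List[Outcome],
    min_accuracy: float,  # stand-in for the business-specified minimum
) -> Tuple[str, float]:
    """Keep the champion while it meets the bar; otherwise promote the
    best-scoring challenger, if one beats the decayed champion."""
    champion_name, champion_model = champion
    best_name, best_score = champion_name, accuracy(champion_model, batch)
    if best_score >= min_accuracy:
        return best_name, best_score
    for name, model in challengers.items():
        score = accuracy(model, batch)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# Toy batch of fresh offer outcomes (illustrative values only).
fresh_batch = [
    ({"discount": 0.30}, True),
    ({"discount": 0.10}, False),
    ({"discount": 0.25}, True),
    ({"discount": 0.05}, False),
]
always_accept: Model = lambda features: 0.9
discount_rule: Model = lambda features: 0.8 if features["discount"] >= 0.2 else 0.1

winner = select_production_model(
    ("always_accept", always_accept),
    {"discount_rule": discount_rule},
    fresh_batch,
    min_accuracy=0.7,
)
```

Here the champion gets only half of the fresh outcomes right, which is below the 0.7 bar, so the challenger is promoted. A real deployment would rerun this check on every new feed of data and would likely use a more robust fitness measure than raw accuracy.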

Clearly, this only scratches the surface of the multifaceted topic of predictive analytics’ role in decision confidence. As a take-away, what’s important to remember is that these best practices apply to business analytics requirements of any volume, velocity or variety, not just at big-data scales.