Nurturing open marketplaces for predictive models and modeling expertise

Big Data Evangelist, IBM

Anybody can be Nostradamus, and I don't mean that in a good way. Just like that 16th-century French mystic, you or I could easily spend our lives spinning grandiose, alluring and unverifiable visions of the future. If you swirl in just the right amount of symbolic apocalyptic hokum, you can never be disproved, no matter how the future unfurls. People in coming centuries can read whatever they wish into your prophecies.

Predictive analytics is the exact opposite. It's essentially the quantitative scientific process incarnate, not qualitative speculation run amok. In the world of statistical analysis, your predictive models are 100 percent grounded in historical data. Not just that, they make quantitative assertions about specific future scenarios (for example: a certain cluster of independent variables will cause a specific percentage of specific types of customers to accept a specific offer in a specific week) that can be verified if and when they actually come to pass.
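To make that contrast concrete, here is a minimal sketch in Python of a prediction that is grounded in historical data and can be checked once the future arrives. Everything in it is an illustrative assumption rather than a real modeling workflow: the customer segments, the tiny data set, the `fit_acceptance_rates` and `verify` helpers and the 15-point tolerance are all hypothetical.

```python
# Hypothetical historical data: (customer segment, did they accept the offer?)
history = [
    ("loyal", True), ("loyal", True), ("loyal", False), ("loyal", True),
    ("new", False), ("new", True), ("new", False), ("new", False),
]


def fit_acceptance_rates(records):
    """Estimate, per segment, the share of customers who accepted past offers."""
    counts = {}
    for segment, accepted in records:
        total, hits = counts.get(segment, (0, 0))
        counts[segment] = (total + 1, hits + int(accepted))
    return {segment: hits / total for segment, (total, hits) in counts.items()}


def verify(predicted_rate, outcomes, tolerance=0.15):
    """Compare a predicted rate with what actually happened.

    The tolerance is an arbitrary illustrative threshold, not a statistical test.
    """
    observed = sum(outcomes) / len(outcomes)
    return abs(observed - predicted_rate) <= tolerance, observed


# The "model": acceptance rates learned from history.
rates = fit_acceptance_rates(history)

# The quantitative assertion: this share of loyal customers will accept next week.
predicted = rates["loyal"]

# Hypothetical outcomes observed once that week has actually passed.
next_week_loyal = [True, True, False, True, True]
held_up, observed = verify(predicted, next_week_loyal)
print(f"predicted {predicted:.0%}, observed {observed:.0%}, held up: {held_up}")
```

Unlike a prophecy, the forecast here names a segment, a number and a time window, so a later week of data can show it to be wrong.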

Consequently, it's not necessarily true that simply anyone can do high-quality predictive modeling. They need the training, tools and, of course, the data to go further. And it couldn't hurt to have the personal aptitudes that I discussed in this post. There's a clear learning curve where predictive modeling is concerned, which means that, as demand for data scientists and other skilled predictive-modeling professionals continues to grow, the supply of qualified individuals will catch up only after a considerable time lag.

However, the dreaded data scientist shortage crisis apocalypse is greatly exaggerated, contrary to the headline of this recent article. As I stated in this blog from last year, the skills deficit isn't as dire as it's made out to be. There are several reasons why.

  • First, data scientists are reusing more of their work and, as a result, needing to develop fewer models from scratch.
  • Second, data scientist skills are increasingly being sourced, as needed, from external professional services firms and being developed in-house.
  • Third, more organizations are establishing data science centers of excellence to nurture and grow this talent internally.
  • And, last but not least, data science autodidacts (self-taught, uncredentialed, data-passionate individuals) are playing a more significant role in many organizations’ predictive brain trusts.

This recent article more or less corroborates that perspective. Author Madhu Reddy identifies three broad trends that, going forward, will minimize the likelihood of predictive-modeling skills shortages:

  • Creating more data scientists via massive open online courses (MOOCs)
  • Providing predictive analytics tools for non-data scientists via self-service visual exploration tooling
  • Matching supply and demand for predictive models by establishing open marketplaces within which they can be bought and sold like any other monetizable intellectual property

That last point, the development of predictive analytics marketplaces, is the most fundamental. As Reddy states, "A prediction marketplace enables a data scientist to create a model just once and then repeatedly customize the model, rather than start from scratch each time. With an efficient marketplace, over the same period, a data scientist can build many more different models for different customers, compared to repeatedly recreating the same models from scratch."

Beyond the buying and selling of predictive models in the open market, I'd add that open-source data-scientist expertise markets are fundamental to this vision. As I stated in this blog last year, "Big data will leverage the most open arena of all, “crowdsourcing” cloud approaches, such as Kaggle and TopCoder, to pool the world’s expertise (or at least that of all the smart people in your company and value chain) in wide-ranging development, investigation and exploration of analytics—and data-infused business problems from all conceivable angles."

Predictive marketplaces should allow us to reuse both the models and the modelers themselves within our data science initiatives. Even as data science skills become ubiquitous in the modern world, there will still be a need for efficient marketplaces that match the supply of those models and skills with the demand for them.

Open marketplaces are an inexhaustible source of economic dynamism. Adam Smith, not Nostradamus, was the seer who saw this vision most clearly.