Next best expert: Collaboration of people and machines on big data and analytics

Big Data Evangelist, IBM

In the cloud, crowdsourcing means tapping into all of the expertise that's in the ecosystem. This world's problems are too large for any one person to have all the answers. Likewise, the challenges confronting humanity are growing too complex, dynamic and multifaceted for any one algorithm, statistical model or machine-learning program to address definitively.

Creativity is the ability to find solutions to unfamiliar problems using any resources at your disposal--and that includes any collaborators you can engage. More and more, the "crowd in the cloud" is some ad-hoc constellation of people and machines. With advances in cognitive computing and machine learning, people are starting to realize that, within specific scenarios, computers—such as IBM Watson—can be as creative as any human. Considering the complexity of the algorithms involved plus their ability to learn from fresh data and dynamic circumstances, these automated systems can generate unprecedented outputs that might not have occurred to any human expert beforehand.

One touchstone in this direction is the emerging notion of computational creativity, as discussed in this MIT Review article. A computational process might exhibit creativity, say IBM researchers, if it can generate "a product that is judged to be novel and also to be appropriate, useful or valuable by a suitably knowledgeable social group." In other words, creativity can be an aptitude of automated systems, but only humans can truly judge whether computationally generated outputs meet the criteria of novelty, appropriateness, usefulness and value.

People and computers need each other. As Walter Frick stated in this recent Harvard Business Review article, "in many areas, the combination of human and machine intelligence will outperform either on its own."

In the era of big data and cloud computing, we need to stretch our notion of creativity to reflect the increasingly co-equal and co-dependent collaboration of humans and machines. In the open cloud marketplace of expertise, we need to accept the fact that the smartest person in the virtual room may not be a person. As we address more formidable big data challenges, we need to cast our net broadly to include the creative contributions of analytic models, cloud services and other programmatic resources in the final product. Perhaps they can spot emerging predictive patterns in petabytes of data faster than any human can. If so, they may have already calculated the optimal solutions.

And we need to stop torturing our imaginations with the crazy notion that machines will enslave us to a fetal eternity as hollowed-out fleshpods in "The Matrix." As I noted in this Dataversity article from October 2013, "machine learning is a productivity tool for data scientists, helping them to get smarter, just as machine learning algorithms can’t get smarter without some ongoing training by data scientists."

Frick makes an equivalent point in his HBR article: "When experts’ subjective opinions are quantified and added to an algorithm, [the algorithm's predictive] quality usually goes up. So pathologists’ estimates of how advanced a cancer is could be included as an input to the image-analysis software, the forecasts of legal scholars about how the Supremes will vote on an upcoming case will improve the model’s predictive ability, and so on."
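To make Frick's point concrete, here is a minimal sketch of what "adding a quantified expert opinion to an algorithm" can look like. Everything in it is an illustrative assumption rather than anything from Frick's article: the data is synthetic, the names (`make_synthetic_cases`, `expert_score`, `evaluate`) are hypothetical, and a plain least-squares model stands in for whatever algorithm is actually in use. The setup assumes the expert perceives, noisily, a signal that the algorithm's own measurements miss.

```python
import random

def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    n_features = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n_features)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(n_features)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_features):
        pivot = max(range(col, n_features), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n_features):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n_features] / a[i][i] for i in range(n_features)]

def mean_squared_error(weights, rows, targets):
    errors = [(sum(w * v for w, v in zip(weights, r)) - t) ** 2
              for r, t in zip(rows, targets)]
    return sum(errors) / len(errors)

def make_synthetic_cases(n, rng):
    """Synthetic cases (an assumption for illustration, not real data)."""
    cases = []
    for _ in range(n):
        measured = rng.gauss(0, 1)                  # signal the algorithm can see
        hidden = rng.gauss(0, 1)                    # signal only the expert perceives
        expert_score = hidden + rng.gauss(0, 0.5)   # expert's noisy, quantified estimate
        outcome = 2 * measured + 3 * hidden + rng.gauss(0, 0.5)
        cases.append((measured, expert_score, outcome))
    return cases

rng = random.Random(0)
train = make_synthetic_cases(200, rng)
test = make_synthetic_cases(200, rng)

def evaluate(use_expert):
    """Fit on the training cases and return mean squared error on held-out cases."""
    def features(case):
        measured, expert_score, _ = case
        return [1.0, measured] + ([expert_score] if use_expert else [])
    weights = fit_least_squares([features(c) for c in train], [c[2] for c in train])
    return mean_squared_error(weights, [features(c) for c in test], [c[2] for c in test])

mse_without_expert = evaluate(use_expert=False)
mse_with_expert = evaluate(use_expert=True)
print(f"held-out MSE without expert input: {mse_without_expert:.2f}")
print(f"held-out MSE with expert input:    {mse_with_expert:.2f}")
```

In this toy setup the model that receives the expert's score as an extra input makes smaller held-out errors, because the expert is supplying information the other feature lacks. The expert's judgment enters as an input to the model rather than as an override of its output.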

However, the benefits of give-and-take collaboration between human and machine are not always clear-cut in practice. According to Frick, when human experts attempt to second-guess the output of algorithms that they've had no hand in tailoring, the results may be counterproductive. "When experts apply their judgment to the output of a data-driven algorithm or mathematical model (in other words, when they second guess it), they generally do worse than the algorithm alone would."

Crowdsourcing should involve finding the next best expert (human, machine or some partnership among any and all), depending on the problem at hand. Who'll write the rules that determine who that next best expert is, in any particular circumstance? It may never be possible to write hard-and-fast rules in this regard. At some level, ad-hoc expert judgments may be needed to evaluate whether a machine-learning model has bested a human in identifying the solution to a thorny problem.

From an operational standpoint, the crux of the issue is this: Will your experts have enough humility to defer to superior machine-generated solutions in various circumstances? And will that be acceptable to the stakeholders who rely on those human experts' judgments?