Hidden Biases That May Cloud Cognitive Computing
Applying scrutiny to the unconscious biases that data scientists bring to their analytics and algorithms
Everybody knows that unconditional faith in data is naïve. I dissected this issue in a recent two-part series, “When and When Not to Have Faith in Statistical Models,” in IBM Data magazine.1 Unconditional faith in reasoning is also dangerous. If reasoning were the gold standard for all modes of thinking, then evolution would have long ago produced a superlogical superspecies that puts Star Trek’s Mr. Spock to shame.
By reasoning, I’m referring to cognition: the conscious, critical, logical, and detailed thought-processing faculties inherited by all Homo sapiens. Many people regard cognition as a higher faculty than all the rest: emotion, sensation, volition, self-preservation, and so on. From an evolutionary perspective, ranking any of these faculties as more important than the others is difficult, because the ability to adapt to environmental changes depends on seamless orchestration among them all.
Impact of bias
Even if we single out cognition for special prestige, venerating it unconditionally is hard when we consider the range of biases that continually hobble it on many levels. Automating and accelerating it all with cognitive computing doesn’t do away with the biases. Instead, it makes their impact more acute, because we’re more likely to ignore them when our clouds deliver cognitive processing results far more rapidly, extensively, and consistently than we mortals could ever manage on our own.
Where cognitive computing is concerned, we must place special scrutiny on the unconscious biases that data scientists—the primary developers of these models, analytics, and algorithms—build into their handiwork. As I discussed in a blog last year, several levels of bias are endemic to data science work—no matter how brilliant and disciplined the individuals doing this work happen to be.2 These levels consist of biases in cognition, selection, sampling, modeling, and funding.
In a recent blog, Scott Berkun echoed these concerns about bias in data science, and added some interesting twists to the discussion. In particular, he said that evidence-driven narrative plays a critical role in shaping, and perhaps skewing, how we interpret the outputs of cognitive processing. Berkun said, “No matter how much data you have, you will still depend on intuition to decide how to interpret, explain, and use the data.”3
This statement echoes my concern with how data scientists focus, or fail to focus, on the conjoined model-building and narrative-building responsibilities in their development work. As I’ve said elsewhere, “For the data scientist, visual patterns serve the core narrative-building functions of framing the opportunity, problem, and solution. If you are both a data scientist and a decision scientist, many of the data-based patterns you call out concern regularities in human decision-making behavior. Past behavioral patterns in some target audience, factored as evidence into the model-based narrative, can bolster a case for or against some course of action that seeks to influence those behaviors.”4
If data scientists skew the evidence-based narrative with some unconscious bias, they’ve thereby weakened the utility of whatever cognitive-computing model it corresponds to. Likewise, if data scientists try to argue that the data upon which they built that model and narrative is free of bias, they’re distorting the irreducibly intuitive core of their work. Where there’s intuition, there is judgment; and where there’s judgment, there’s bias.
For example, as Berkun notes, more data scientists are using A/B testing to generate the data that helps them build, score, and iterate their models. He notes that, “In A/B testing, you use intuition to decide what B is. Underneath all of our rational intellect is intuition, which influences our ‘rational’ behavior far more than we admit. Often, data yields unavoidable trade-offs where two or more options are equally viable and someone must make a judgment call beyond the data.”5
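Berkun’s point about judgment calls beyond the data can be made concrete with a minimal A/B test sketch. The numbers below are hypothetical, and the two-proportion z-test is just one conventional way to score such a test; the choices of what variant B is, how large a sample to collect, and what significance threshold to apply are all judgment calls that precede the arithmetic.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-statistic comparing conversion rates of A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical results: B converts slightly better than A ...
z = two_proportion_z(200, 5000, 228, 5000)
print(round(z, 2))  # ... but z falls short of the usual 1.96 cutoff
```

Here the data alone doesn’t settle the matter: variant B looks better, yet the result isn’t significant at the conventional 95 percent level. Whether to ship B anyway, keep testing, or abandon it is exactly the kind of intuition-driven trade-off Berkun describes.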
Clearly, unconditional faith in data scientists is just as bad as unconditional faith in data, reasoning, and cognitive computing. Biases cloud their cognition, and hence their work, just as they cloud our interpretations of its outputs. But that’s not necessarily always bad. The best data scientists have a bias for results. They earn their pay because they use educated judgments to identify the data, algorithms, and models best suited to a particular cognitive-computing challenge.
Please share any thoughts or questions in the comments.
1 “When and When Not to Have Faith in Statistical Models: Part 1 and Part 2,” by James Kobielus, IBM Data magazine, January 2014.
2 “Data Scientist: Bias, Backlash, and Brutal Self-Criticism,” by James Kobielus, Big Data & Analytics Hub blog post, May 2013.
3,5 “The Dangers of Faith in Data,” by Scott Berkun, blog post, November 2013.
4 “Decision Scientists? Big Data as Evidence in Building a Business Case,” by James Kobielus, Big Data Integration LinkedIn group discussion, May 16, 2013.