The Big Data Paradox: More Data, Less Confidence

Confidence is not an all-or-nothing attribute

Program Director, Analytics Platform Marketing, IBM

Most of us can remember a time when we thought that more data would bring more confidence. Business people wanted more information about their customers, their competitors, and the marketplace—and when they got it, they made their decisions with more assurance.

Today, in the era of big data, individuals and organizations are so overwhelmed with information that those assumptions are being turned upside down. More data can lead to less confidence. When we needed to get our own internal systems in order, that was a challenge we could understand. We needed to rationalize information across our CRM, ERP, customer support and other systems—not an easy task, but one we could define and manage.

Now, confronted with the need to rationalize all that data and then to layer on mountains of additional information from constant meter readings, contractor inputs from mobile devices, and unpredictable comments from social media, we may decide there is too much information—too much, at least, for rational and confident analysis.

In a 2012 research report, Aberdeen Group noted that "the more data sources a company uses, the lower the trust in their data becomes."1 That lower trust can diminish the value of the analytics that every organization needs to enable smarter decisions, respond rapidly to change, and make rational forecasts about what is to come.

Consider what happened when the CEO of a telecommunications company lacked confidence in the data in his reports. He complained to his CIO and the business intelligence team, saying, “I can’t run a company if I have two different revenue reports that show different results. Fix it!” In fact, one set of reports was based on the company’s ERP system; the other, on the transactional ledger. One report conformed to generally accepted accounting principles (GAAP)—for example, including accruals and calculating maintenance revenue as an evenly distributed revenue stream—while the other did not. In actuality, both reports were correct. But without an understanding of the purpose and context of the reports, the CEO lacked the confidence needed for key decisions.
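To see how two correct reports can disagree, here is a minimal sketch in Python. The contract value, the term, and the function names are hypothetical and chosen only for illustration; they are not taken from the telecommunications company's systems. It assumes a 12-month maintenance contract billed up front, with one view booking the full invoice when it is billed and the other spreading the revenue evenly over the term.

```python
from decimal import Decimal

# Hypothetical figures: a 12-month maintenance contract billed up front.
CONTRACT_VALUE = Decimal("12000")
TERM_MONTHS = 12


def ledger_revenue(month: int) -> Decimal:
    """Transactional view: the whole invoice lands in the month it is billed."""
    return CONTRACT_VALUE if month == 1 else Decimal("0")


def ratable_revenue(month: int) -> Decimal:
    """Evenly distributed view: an equal share in each month of the term."""
    if 1 <= month <= TERM_MONTHS:
        return CONTRACT_VALUE / TERM_MONTHS
    return Decimal("0")


def totals(first_month: int, last_month: int):
    """Return (ledger total, ratable total) for an inclusive range of months."""
    months = range(first_month, last_month + 1)
    ledger = sum(ledger_revenue(m) for m in months)
    ratable = sum(ratable_revenue(m) for m in months)
    return ledger, ratable


if __name__ == "__main__":
    print("Q1:  ", totals(1, 3))
    print("Year:", totals(1, 12))
```

Under these assumptions the two views differ for the first quarter and agree over the full year, which is why context about a report's purpose matters as much as the numbers in it.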

Gaining confidence in data can lead to better business outcomes across the board. But first, it pays off inside the company, as business people come to trust their reports and their forecasts.

So how can an organization increase confidence in information in a world of big data? One proven approach is to implement information governance. With governance, you can understand the data acquired from diverse sources, view its complete lineage and determine an appropriate level of trust, and monitor and protect sensitive data.

As we’ve discussed in the last four columns in this space, information governance isn’t a monolithic initiative that needs to be undertaken all at once. Instead, it’s a programmatic approach to increasing confidence in data that can start from several different points. For example, some business scenarios lead to the creation of a common business glossary as an entry point to governance. Sometimes taking control of master data is key. In other organizations, at other times, managing the data lifecycle or protecting sensitive information is the perfect starting point for information governance. Regardless of the entry point, the level of confidence in enterprise information increases along the governance journey.

Confidence isn’t an all-or-nothing attribute. You may always have more confidence in data from your corporate financial systems than in random comments monitored on Twitter. But that doesn’t mean that social media input has no value. In fact, a consistent pattern of complaints on social media about a particular product feature might be a very strong indicator of a problem in need of attention. So it’s worth applying an appropriate level of governance to that information as well as the more trusted information from within your enterprise.
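One way to picture graded confidence is as a trust weight per source. The sketch below is an illustration only: the source names, the weights, and the threshold are assumed values, not figures from any governance product. It shows how many low-trust social media mentions of the same product feature can add up to a signal worth investigating.

```python
from collections import defaultdict

# Hypothetical trust weights per source on a 0-to-1 scale (assumed values).
SOURCE_TRUST = {
    "financial_system": 0.95,
    "customer_support": 0.80,
    "social_media": 0.30,
}


def evidence_by_feature(reports):
    """Sum trust-weighted mentions for each product feature.

    `reports` is an iterable of (source, feature) pairs.
    Unknown sources contribute nothing.
    """
    scores = defaultdict(float)
    for source, feature in reports:
        scores[feature] += SOURCE_TRUST.get(source, 0.0)
    return dict(scores)


def needs_attention(reports, threshold=2.0):
    """Return the features whose weighted evidence reaches the threshold."""
    scores = evidence_by_feature(reports)
    return sorted(f for f, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    reports = [("social_media", "battery")] * 10
    reports += [("social_media", "camera")] * 2
    reports += [("customer_support", "camera")]
    print(needs_attention(reports))
```

In this example, ten social media complaints about one feature cross the threshold, while three scattered mentions of another do not, even though one of them comes from a more trusted source.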

Watson: Confidence makes the difference

In 2011, when IBM’s Watson supercomputer defeated two of the most successful champions in the history of the quiz show Jeopardy!, confidence was a critical factor. Watson’s success, like that of human Jeopardy! winners, depended not only on an ability to process enormous volumes of information to find potential answers, but also on an ability to assess confidence with both accuracy and speed.2

One of the most intriguing aspects of the performance to TV viewers was Watson’s discernment of a level of confidence in each answer—and the visibility into that confidence level, as percentages were displayed to the audience. I don’t pretend to know the complex algorithms that went into that confidence assessment, but it’s clear that confidence was not black or white, and that confidence made a significant difference in the results of the game.
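The general idea of acting on an answer only when confidence is high enough can be sketched in a few lines. This is emphatically not Watson's algorithm; the function name, the candidate scores, and the threshold are hypothetical, and the scores are simply assumed to be given.

```python
def choose_answer(candidates, threshold=0.5):
    """Pick the highest-confidence candidate, or abstain.

    `candidates` maps each candidate answer to a confidence between 0 and 1.
    Returns (answer, confidence) if the best candidate meets the threshold,
    otherwise None.
    """
    if not candidates:
        return None
    answer, confidence = max(candidates.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return answer, confidence
    return None


if __name__ == "__main__":
    # Hypothetical scores: one clear favorite, then no clear favorite.
    print(choose_answer({"answer A": 0.92, "answer B": 0.05}))
    print(choose_answer({"answer A": 0.30, "answer B": 0.25}))
```

The first call returns the favorite along with its score; the second abstains because no candidate is trusted enough, which is the same graded view of confidence applied to a single decision.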

You may not have the tools to fine-tune a confidence assessment as Watson did (although commercial applications of Watson are becoming available), but you can apply data governance principles to increase your confidence in the data you have, and thus improve business results.

What impact—either positive or negative—has the confidence factor had on your business? Please share your thoughts in the comments.

1 Aberdeen Group, “The Big Data Imperative: Why Information Governance Must Be Addressed Now,” December 2012.
2 David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager, Nico Schlaefer, and Chris Welty, “Building Watson: An Overview of the DeepQA Project,” AI Magazine, Fall 2010.
