Don’t Overhype Data Science Expectations

Why it’s important to keep outsized data science claims in perspective

Thomas Deutsch, Solution CTO, IBM

Someone once told me that expectation management was the key to a happy life. Without getting into a debate about how valuable that advice is as life wisdom, I will say that it definitely proves useful for managing big data projects. As I have written here before, getting fixated on “bet the company” sorts of projects and outsized results is poor expectation management. From the discussions I’ve been involved in, probably the greatest need for expectation management in the big data space right now revolves around the “magic” of data science.

Of course, data science isn’t magic at all; rather, it’s hard work and lots of preparation. But you wouldn’t know that from much of the online chatter about the topic. Those of you who know me are probably already guessing that I’m going to slip into debunking mode now—and you’re right.

Consider the public assertion made recently by an executive at an IBM competitor. I’m paraphrasing, but he basically said that data scientists can often drive orders-of-magnitude increases in the efficiency of solutions through rapid iteration. I almost fell out of my chair when I read that. Orders of magnitude? That is definitely not honest expectation management.

Has a data scientist ever found a way to make orders-of-magnitude improvements somewhere? Sure, probably in some massively screwed-up environment. What about in an existing, well-run business? Perhaps occasionally. But in an organization that already has a seasoned data analytics or business intelligence team? I’d argue that it happens very rarely.

A quick check of the math bears this out. An increase of one order of magnitude is the same as multiplying a quantity by 10. Doing something 10 times better in a functioning business that is already running at scale is really hard. If you could easily convert 10 times more prospects to customers, you would be doing it already. Take, for example, Google’s constant experimentation.
They tinker with the goal of improving things by tenths of a percentage point (which, given the scale of their business, clearly makes a meaningful impact). They are not aiming for tenfold improvements. Setting the expectation that you’ll grow revenue by 10 times, produce widgets with one-tenth the number of defects, or achieve 10 times better retention when your company is already operating well is just not realistic.

Another good example of this came up in a conversation I had recently with the head of analytics for a major B2C firm. The folks at this firm are very smart, and they are at a point where they can identify an individual using bits of consumer information from multiple sources about 20 percent of the time. We’re about to start working with them using the next generation of IBM® entity resolution and big data technologies, and we want to help drive their positive identification rate to 22 percent or even 24 percent over the next 12 months. As with the Google example, this improvement would have a big impact on their business—but orders of magnitude? No way.

I would never try to explain the potential of what we can do in terms of orders of magnitude. And you shouldn’t either. Don’t fall victim to the hype. Instead, have the people who actually do the work volunteer the target improvements, and then sanity-check those measures against how your business actually runs. You will be far better off by targeting modest, incremental, and sustainable goals than setting unrealistic expectations.

What do you think? Let me know in the comments.
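To make the arithmetic concrete, here is a minimal sketch (my own illustration, not anything shipped by IBM) that contrasts the incremental target from the B2C example with what a literal one-order-of-magnitude claim would imply:

```python
def relative_lift(before: float, after: float) -> float:
    """Relative improvement as a multiplier (1.0 means no change)."""
    return after / before

# The B2C firm's positive identification rate: 20% today, 22% targeted.
incremental = relative_lift(0.20, 0.22)  # a 1.1x multiplier, i.e. a 10% relative gain

# An increase of one order of magnitude means multiplying by 10.
# Applied to a 20% identification rate, that would demand 200% --
# which is not merely ambitious, it is impossible for a rate.
order_of_magnitude = 0.20 * 10

print(f"Incremental target: {incremental:.2f}x the current rate")
print(f"Order-of-magnitude claim: {order_of_magnitude:.0%} identification rate")
```

The gap between a 1.1x lift and a 10x lift is the whole point: one is a hard-won, sustainable goal; the other is hype that the numbers cannot support.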
