Measuring the Business Value of Big Data

Big Data Evangelist, IBM

Putting a dollar value on data is a very tricky endeavor. Data is only as valuable as the business outcomes it makes possible, though the data itself is usually not the only factor responsible for those outcomes. Doug Laney of Gartner provides a good discussion here of the challenges in attaching a monetary value to data. He refers to his approach as “infonomics.”

If you can’t put a meaningful value on data, in the abstract, you can hardly put a monetary value on big data. That fact hasn’t deterred some people, such as the author of this article. He includes a chart called “data value chain” that purports to measure the value of individual data items and of aggregated data sets on the same (undefined) scale. In his chart, the value (however measured) of individual items declines over time while that of aggregates grows. In other words, he essentially asserts that big data’s value grows over time commensurate with some vague metric of its volume and/or variety.

It’s hard to know what to make of this approach, which abstracts the aggregate’s value from any notion of its business application. I think it’s better to focus on data’s instrumental value in decision support, which is, after all, the core function of traditional business intelligence and of a lot of big data, advanced analytics and data science applications as well.

If we tie data’s value to its potential in supporting decisions that lead to positive business outcomes, we have a sounder basis for valuation. In such a framework, we can measure the value of each individual datum and of the aggregate on the same scale.

How can we tie this back to putting a monetary value on big data? One high-level approach might be to consider the decision-support value of the aggregate along the “four Vs.” What follows is a look at how you might measure the customer lifetime value (CLV) impact of big data used for customer relationship management:

  • Volume-based value: The more comprehensive your 360-degree view of customers and the more historical data you have on them, the more insight you can extract from it all and, all things considered, the better decisions you can make in the process of acquiring, retaining, growing and managing those customer relationships.
  • Velocity-based value: The faster you can ingest customer data into your big-data platform and the more questions a user can pose against that data (via queries, reports, dashboards, etc.) within a given time period, the more likely you are to make the right decision at the right time to achieve your customer relationship management objectives.
  • Variety-based value: The more varied customer data you have – from the CRM system, social media, call-center logs, etc. – the more nuanced a portrait you have of customer profiles, desires and so on, hence the better-informed decisions you can make in engaging with them.
  • Veracity-based value: The more consolidated, conformed, cleansed, consistent and current the data you have on customers, the more likely you are to make the right decisions based on the most accurate data.

How can you attach a dollar value to any of this? It’s not difficult. CLV is a standard metric that you can calculate from big-data analytics’ impact on customer acquisition, onboarding, retention, upsell, cross-sell and other concrete bottom-line indicators, as well as from corresponding improvements in operational efficiency.
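As a rough illustration of that calculation, here is a minimal sketch in Python. It assumes a common textbook simplification of CLV (constant margin per period, constant retention rate, infinite horizon, first margin received at the end of the first period), which is not necessarily the formula your organization uses. The function names and the figures in the usage lines are hypothetical, chosen only to show the arithmetic.

```python
def simple_clv(margin_per_period, retention_rate, discount_rate):
    """Simplified CLV: margin * r / (1 + d - r).

    Assumes constant margin and retention over an infinite horizon.
    This is an illustrative simplification, not a full CLV model.
    """
    if not 0.0 <= retention_rate <= 1.0:
        raise ValueError("retention_rate must be between 0 and 1")
    if discount_rate < 0.0:
        raise ValueError("discount_rate must not be negative")
    denominator = 1.0 + discount_rate - retention_rate
    if denominator <= 0.0:
        raise ValueError("retention 1.0 with discount 0.0 gives unbounded CLV")
    return margin_per_period * retention_rate / denominator


def clv_uplift(customers, margin_per_period, discount_rate,
               baseline_retention, improved_retention):
    """Total CLV gained across a customer base when retention improves."""
    before = simple_clv(margin_per_period, baseline_retention, discount_rate)
    after = simple_clv(margin_per_period, improved_retention, discount_rate)
    return customers * (after - before)


if __name__ == "__main__":
    # Hypothetical figures: 1,000 customers, $100 margin per year,
    # 10% discount rate, retention rising from 80% to 85%.
    uplift = clv_uplift(1000, 100.0, 0.10, 0.80, 0.85)
    print(f"Estimated CLV uplift: ${uplift:,.2f}")
```

The hard part in practice is not this arithmetic but attribution: deciding how much of the retention improvement you can credit to the analytics rather than to pricing, service or market conditions.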

Clearly, what we’re valuing with such a framework is not just the customer data, but also the entire set of customer data management, governance and analytics practices. In fact, it’s pointless to put an economic value on the data itself if you fail to sustain this entire body of best practices. The scale of the data says nothing about its fitness to support high-quality business decisions. Data’s quality and potential business benefit degrade to the extent that you slack off on governance.

Small data can have more value than a corresponding big-data collection. A tiny, but well-governed, data set might have greater quality, hence monetizable value, than a massive, but poorly governed, database. In fact, a sample from a big-data collection might have as much economic value as, or more than, the entire petabyte-scale database from which it was extracted. If the sample is representative and was used to develop a highly predictive correlation model, it might be possible to purge the vast majority of the big-data population without diminishing the business value of the derived artifacts: sample, model and model-based narrative (the latter might be equated with the “story” told by a “data journalist,” per my recent blog).
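To make the sampling point concrete, the sketch below builds a synthetic customer population and compares a churn-rate estimate from a small random sample against the figure for the whole population. Everything here is an assumption for illustration: the population is simulated, the 20% churn rate and the sizes are invented, and the function names are hypothetical.

```python
import random


def estimate_rate(flags):
    """Share of True values in a sequence of booleans."""
    return sum(flags) / len(flags)


def compare_sample_to_population(population_size=100_000, sample_size=2_000,
                                 true_rate=0.2, seed=42):
    """Return (population churn rate, sample churn rate) for synthetic data."""
    rng = random.Random(seed)
    # Synthetic population: each customer churns with probability true_rate.
    population = [rng.random() < true_rate for _ in range(population_size)]
    # Simple random sample without replacement.
    sample = rng.sample(population, sample_size)
    return estimate_rate(population), estimate_rate(sample)


if __name__ == "__main__":
    pop_rate, sample_rate = compare_sample_to_population()
    print(f"population churn rate: {pop_rate:.3f}")
    print(f"sample churn rate:     {sample_rate:.3f}")
```

With a 2% random sample the two estimates should land close together, which is the sense in which the sample can carry most of the decision-support value. That holds only if the sample really is representative; a sample drawn from, say, only the newest customers would offer no such assurance.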

You might even be able to impute a value to an individual data item under this general approach. You would need to apply “for want of a nail the kingdom was lost” logic, but it can be done. Sometimes, the exact right piece of low-level data at the right time can make all the difference. Under this admittedly uncommon scenario, you might consider the datum the golden “needle” and the big-data collection from which it was extracted the occasionally bewildering “haystack.”

If you’re going to the logical extreme, you might impute infinite value to your own intuition – aka “gut feel” – which can produce powerful decisions in the absence of any data and analytics of any sort. But I wouldn’t advise that approach. Your genius intuitions may be indistinguishable from lucky guesses or sheer madness.

Sometimes it’s best to say a little bird told you.