In my first post I introduced the idea that most “big data” isn’t really big at all, and doesn’t conform to Gartner’s 3V’s. Instead, I suggested that there’s benefit in focussing on “broad data”: the use of many different sources of data to give us richer information. I put forward the 4O’s of broad data: original, obscure, overlapping and augmenting. This post looks at how to use broad data, and where to find it.
How to use it
The defining feature of broad data comes from “the power of the join”: the release of extra value when we combine two data sets, bringing out new information that wasn’t available before.
Broad data gives us more attributes to describe our existing data, enriching it and increasing detail and accuracy. This can be translated into:
• better, more competitive pricing
• targeted marketing
• targeted interventions to our operations
• new insights to challenge our strategy
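To make “the power of the join” concrete, here is a minimal sketch in Python. It assumes a motor insurer whose internal policy records can be matched to a second data set of telematics summaries on a shared key. Every name and figure in it (`policy_id`, `harsh_braking_per_100km`, the premiums and so on) is invented purely for illustration, and it assumes at most one telematics record per policy.

```python
# Hypothetical internal data: one record per motor policy.
policies = [
    {"policy_id": "P1", "driver_age": 24, "annual_premium": 900},
    {"policy_id": "P2", "driver_age": 24, "annual_premium": 900},
    {"policy_id": "P3", "driver_age": 51, "annual_premium": 450},
]

# Hypothetical broad data: telematics summaries keyed by the same policy_id.
# Not every policy has a telematics record.
telematics = [
    {"policy_id": "P1", "harsh_braking_per_100km": 0.4, "night_driving_share": 0.05},
    {"policy_id": "P2", "harsh_braking_per_100km": 3.1, "night_driving_share": 0.40},
]


def left_join(left, right, key):
    """Enrich each left record with the attributes of its matching right record.

    Left records with no match are kept unchanged, so no existing data is lost.
    Assumes at most one right record per key value.
    """
    index = {row[key]: row for row in right}
    return [{**row, **index.get(row[key], {})} for row in left]


enriched = left_join(policies, telematics, "policy_id")

for row in enriched:
    print(row)
```

On the internal data alone, policies P1 and P2 look identical: same age, same premium. After the join they are clearly different risks, which is exactly the extra information that neither data set held on its own. P3 has no telematics record and simply passes through unenriched.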
Logically, we should expect competition to drive more adoption of broad data, barring regulatory or cultural barriers. A good case in point is motor insurance, where insurers try to gather more and more data about drivers to form a more accurate view of the risk, and hence a more accurate premium. Where that accurate premium is lower than competitors’ premiums, the insurer will gain profitable business. Where it is higher, the insurer will lose business that would have been unprofitable at the competitors’ price. Insurers are turning to telematics (in-car monitoring of drivers’ habits) to gain even richer information. A developing issue is the resistance of customers to this form of monitoring.
Where to find it
One of the more exciting aspects of broad data is the prompt to look externally for original, obscure sources of data. The most widely talked about sources these days are from social media, such as Twitter, which is certainly useful for customer sentiment analysis and management, but should be considered for broader uses such as monitoring mentions of illness. More mundane, but better-refined, sources would be official, publicly available ones such as the UK’s Office for National Statistics, and broader data from government agencies. There are many commercially available data sets of potential value, such as financial market data, weather or traffic data. We can see companies beginning to understand that they can monetize (i.e. sell) their internal data sets to other companies, and we should expect more of these less obvious sources to come into play as broad data. Examples might be supermarkets’ loyalty card data, telecoms location data or call patterns, or companies’ own sales by region or market segment. All of these could be extremely useful to other companies. It will be interesting to observe how companies may form new partnerships to share each other’s data in new ways.
However, we shouldn’t forget internal data. Especially in large enterprises, there may be many obscure data sets that are unused but could cast light on customer behaviour or marketing preferences, for example. This could include emails to and from intermediaries or customers, or it might be patterns of customer orders over time. Many enterprises cannot yet identify which customers have bought which products through which sales channels, so there is often some foundational work still to do with internal data.
Some of our internal data will be structured, and exist in well-ordered databases, but much will be unstructured, such as scanned documents, emails or telephone transcripts. Increasingly, technology is enabling us to incorporate these unstructured sources in our analysis. Externally, most social media data would also fall into the unstructured category.
The term “big data”, however misleading, is here to stay. We can accept it while recognising that much of the new approach to data is not really about size at all. Instead it’s about finding and combining new sources of data into our core analyses in order to gain advantage. In our own projects and enterprises we can look for broad data opportunities, using the 4O’s to prompt us. Small can indeed be beautiful, especially if it’s broad.