Big Data. Big Impact.
I had the honor this week of speaking with Om Malik of GigaOm on stage at the Structure: Big Data event in New York City. This was a first-of-its-kind event, bringing together an incredibly interesting group of entrepreneurs, enterprises, industry luminaries, investors and press to discuss the state of the “Big Data” revolution that manifests itself throughout the industry. I must say, I feel “vindicated” by all this activity. I have been talking for years about “category convergence”, suggesting that the convergence of business analytics, data management, search, data warehousing, ETL, text analytics, data protection, and a few other “categories” is necessary to create the business value we all expect from these technologies.
Let’s go back in time. Historically, we have always talked about “structured data” and “unstructured data” as two independent, separate things. That segregation is incredibly important, as all this “data” is at the root of everything we are doing to make sense of it, improve business and societal decision-making, and create some type of sustainable value from this ever-increasing asset. Yet as a result of this thinking (or perhaps at the root of this dichotomy), the technologies to extract knowledge and insight from these data assets have evolved along largely separate paths. Think databases vs. search.
I sometimes think about “why”: why did our view on data evolve this way? Industry experts will tell you that “structured” data makes up only 20% of the data contained within an enterprise. In the consumer internet domain, it’s far less than that. So-called unstructured data (audio, video, raw text, etc.) makes up something like 80% of the data we produce. Interestingly, the level of technology innovation in analyzing and extracting value from structured data appears to have, historically, greatly outpaced that of unstructured data analysis. I think that’s likely due to two primary phenomena. First, unstructured data is hard to deal with. It requires lots of very sophisticated algorithms and extraordinary compute capability to make sense of it. Second, the “structured data” is the data that big enterprises “invest” in, meaning they spend a great deal to structure it in the first place, and then to store, organize, cleanse, maintain, and analyze it. After all, it is the data that defines the most tangible essence of the business (transactions, customers, suppliers, partners, …). So why not invest here first?
Invest in structured data we have. We have entire market segments built around these concepts: ERP, CRM, Data Warehousing, and Business Intelligence, just to name a few. But there remains so much to be learned from those “other” data assets we produce and preserve. If structured data represents the most tangible “essence” of a business, the rest of the data tells the semantic story. And there are so many insights yet to be gained. If structured data led to the massive ERP and CRM industries, I have to wonder where Big Data will lead us over the coming years.
This is where I get excited about what we see happening in “Big Data”. Most of the applications I have seen so far are focused on consumer-facing businesses and traditional enterprises leveraging structured data assets. Data warehouses supporting reporting, ETL jobs running in Hadoop, “machine to machine” analytics making decisions in real time. We have just scratched the surface and the opportunities to go further are immense. The mere fact that we, as an industry, are focusing on this idea of “Big Data”, and how to get insight from it, is extraordinarily rewarding. Not only are we bringing the ability to analyze and understand data to the masses, we are creating new businesses and business models that simply could not exist without these trends. We are finding new ways to create great value for companies that are looking at their data in new ways, gaining new insights, and optimizing their outcomes for a predicted future. We are also finding exciting new ways to help our global society through amazing new genomic and drug discoveries, reductions in energy consumption and lots of other breakthroughs.
So to me, Big Data is all about Big Impact. We in the technology industry have been working on new technologies, practices, and businesses to create compelling value for businesses and consumers. We’ve come up with some pretty good stuff so far, but in my view, the best is yet to come. I will even suggest that the value we create from “Big Data” will eclipse the business value generated from ERP and CRM combined. Stay tuned. This is going to be a very exciting space for years to come.