How data cataloging helps analytics cultures evolve

Director and Distinguished Engineer, Offering Management, IBM

According to anthropologists, around 10,000 years ago, human civilization began to undergo a dramatic change. Until that point, most human societies had been hunter-gatherers, living in small, nomadic groups that spent the majority of their existence searching for food.

The great revolution occurred when humans learned to domesticate various plant and animal species. Agriculture provided much more reliable access to food, enabling the nomads to settle and build larger and more collaborative communities, ultimately resulting in the rise of cities and the creation of sophisticated civilizations.

The analogy carries over to business analytics: most companies today are still making the transition from the hunter-gatherer lifestyle to a more settled and scalable model. Pockets of civilization certainly exist, especially in companies that have built mature data warehouses and dug out data lakes, but most business analysts still spend much of their time out in the wild, desperately searching through different databases, spreadsheets and file systems to find the data they need.

Why do knowledge workers spend more time gathering data than analyzing it? How can enterprises open up access to data assets that live outside the data warehouse? And is enterprise-wide analytics possible without forcing everyone to use the same analytics tools?

From subsistence to abundance

The time and energy spent on gathering data is a huge waste of skilled resources. It limits the potential of business analysts and data scientists to collaborate on an enterprise scale. Instead, small teams of knowledge workers end up fragmented across the organization, each exploiting its own local data sources and wielding its own tools, but struggling to hunt down trustworthy information outside its own territory.

Just as early human societies did, knowledge workers must find ways to domesticate and farm their data to build a sustainable, scalable and exploratory analytics culture. If they can be confident that the data they need will always be available and accessible when they need it, they can settle down and focus on building solutions of real value to the business.

That’s where intelligent data catalogs such as IBM Watson Knowledge Catalog come in. Data catalogs help to herd tribal knowledge and manage all the different data assets across an organization, providing a simple, convenient way to find, access and democratize data.

Taming ungoverned data

IBM Watson Knowledge Catalog provides a framework for governing and controlling access to data sets, whatever they contain and wherever they reside. It’s quicker and easier to add a new data source to Watson Knowledge Catalog than it would be to transform and load it into a data warehouse, and there is no need to copy the data or worry about inflating storage costs. As a result, it’s feasible for organizations to bring all of their different data sets within the governed bounds of the catalog, providing a complete map of the entire company’s knowledge.

That means when knowledge workers (data stewards, data scientists, business analysts and others) need a piece of information, they know exactly where to go to get it. Moreover, unlike a random data set that they might have tracked down in the wild, they know that the data they find in the catalog is not going to bite them. With its lineage capabilities, Watson Knowledge Catalog provides a full pedigree for each data asset, so data citizens can immediately see where it came from, what it contains and whether it is safe to use.
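To make the idea of a catalog with lineage concrete, here is a minimal sketch in Python. It is a toy, in-memory illustration and not the Watson Knowledge Catalog API: the names `CatalogEntry`, `ToyCatalog`, `register`, `search` and `lineage`, along with every asset name and location, are hypothetical and invented for this example. It shows two ideas from the text: the catalog stores a pointer to each data set rather than a copy, and each asset records which assets it was derived from, so its pedigree can be traced.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata about one data asset; the data itself stays at `location`."""
    name: str
    location: str                                   # pointer to the source, not a copy
    owner: str
    tags: set[str] = field(default_factory=set)
    derived_from: list[str] = field(default_factory=list)  # names of parent assets


class ToyCatalog:
    """Hypothetical in-memory catalog, for illustration only."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Only metadata is stored; no rows are moved or duplicated.
        self._entries[entry.name] = entry

    def search(self, tag: str) -> list[str]:
        """Return the names of all assets carrying `tag`, sorted alphabetically."""
        return sorted(n for n, e in self._entries.items() if tag in e.tags)

    def lineage(self, name: str) -> list[str]:
        """Return every upstream asset of `name`, nearest ancestors first."""
        seen: list[str] = []
        queue = list(self._entries[name].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent in seen:
                continue
            seen.append(parent)
            if parent in self._entries:
                queue.extend(self._entries[parent].derived_from)
        return seen


# Example usage with made-up assets and locations.
catalog = ToyCatalog()
catalog.register(CatalogEntry(
    "crm_export", "s3://example-bucket/crm.csv", "sales-ops", {"customers"}))
catalog.register(CatalogEntry(
    "orders", "postgres://example-host/shop.orders", "finance", {"sales"}))
catalog.register(CatalogEntry(
    "customer_revenue", "warehouse.analytics.customer_revenue", "analytics",
    {"customers", "sales"}, ["crm_export", "orders"]))
catalog.register(CatalogEntry(
    "churn_features", "warehouse.ml.churn_features", "data-science",
    {"ml"}, ["customer_revenue"]))

print(catalog.search("customers"))
print(catalog.lineage("churn_features"))
```

In this sketch, an analyst looking for customer data searches by tag instead of hunting through individual systems, and `lineage` walks the `derived_from` links to show where a derived asset ultimately came from. A production catalog would add access policies, persistence and far richer metadata, none of which is modeled here.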

Combining freedom with control

While Watson Knowledge Catalog helps business analysts and data scientists keep an organization’s data under control, it also gives them freedom to use the data in whatever way is most convenient for them. It loosens the tight coupling between data sources and analytics tools, so regardless of whether a business analyst uses IBM Cognos Analytics, Tableau, Looker or Power BI, they get the same easy access to all the information they need when they need it.

As a result, while Watson Knowledge Catalog breaks down the silos of data that have formed within the organization, it doesn’t do so by imposing a restrictive monoculture on users. As long as the data itself is protected and governed, data citizens can use whatever tools they like to transform that data and add value to the business.

IBM Watson Knowledge Catalog offers organizations a bridge from a fragmented array of isolated teams of knowledge gatherers to a scalable, sustainable culture of data citizens that fosters collaboration, enables more sophisticated analysis and activates enterprise data for artificial intelligence.

To learn more about IBM Watson Knowledge Catalog, watch this video or visit the website.