Can Master Data Management and entity analytics be self-service?

Director and Distinguished Engineer, Offering Management, IBM

Over the past few years, traditional Master Data Management (MDM) has evolved, especially in relation to analytics. Historically, MDM projects have focused on creating a single view of the truth for consumption by downstream business processes. By managing and de-duplicating related entities from multiple sources, medium and large enterprises can attain significant cost savings and competitive advantages in several ways:

  • Providing 360-degree views of customer information for call center reps
  • Enriching the onboarding experience delivered directly to clients
  • Supporting the product introduction lifecycle
  • Providing mechanisms to identify fraud 

As a result, MDM continues to be a core technology foundation for multiple initiatives. However, in the age of big data and analytics, organizations are looking to gain a greater competitive advantage by utilizing their master data investment to provide added trust and accuracy to their analytical systems.

Analytics are only as good as the data you feed into them. If your analytical systems are fed disparate, poor-quality, duplicated data, then the business decisions based on those analytics could be incorrect and ultimately damaging.

Garbage in equals garbage out

Organizations are looking to leverage their MDM projects to extend beyond just operational usage into analytical usage, which IBM refers to as entity analytics. Being able to utilize a single view of key business entity data and knowing it is a high-quality, trusted and highly accurate source available within the enterprise can significantly improve the understanding of the other data you’re utilizing for analytics. This entity data could include account, customer, employee, partner, product or supplier information.

Master data used within an analytical context can connect the dots and reveal previously unknown relationships among entities, such as who lives in the same household, works for the same company or bought the same product. It can unlock and make sense of otherwise dark data assets by enriching them with attributes from the master entities, providing the context that makes those assets understandable and relevant.

A perfect storm of innovation

A number of innovations and evolutions now support the move to entity analytics, driven by growing recognition of the power that master data can bring to analytics.

Technological innovations

Graph database technologies that support effective storage, discovery and manipulation of relationships between entities are seeing a marked rise in interest. Machine learning advancements enable data to be automatically discovered, classified and managed. And the growth of the application programming interface (API) economy can open up data to application developers for access in new and innovative ways.

Raised user expectations

A new breed of technology-savvy business users expect to be able to get access to data with a set of self-service tools and get on with their jobs without the need for IT. They have a job to do working directly on the data, and they don’t have the time or will to wait on anyone or anything else. In addition, traditional MDM has been very specific in handling core business entities and attributes that were considered master data. In the world of entity analytics, this definition is blurring. If a business user wants to augment MDM data with operational or transactional data, then that user should be able to do so.

Data explosion

The volume of data now available to business users who want to run analytics on it continues to grow. Enterprise-wide data lakes now empower users to go fishing for data that previously would not have been available to them. Cloud data providers and open data sets are more available than ever before and offer significant value when used in analytics. Add to this level of access the ever-increasing volume of social media data. And when you consider including unstructured data (documents, images and so on), you can see how a huge amount of new data is now available that can benefit from being related to the core master data when used for entity analytics.

Because of all this change and the increasing need to utilize MDM data for entity analytics, IBM is extending these capabilities to IBM Bluemix Data Connect, a managed data preparation and data movement service. Data Connect is the main provider of data ingest capabilities for IBM Watson Data Platform. The latter also includes offerings for analytics, data persistence and deployment that enable data-driven professionals to work together to quickly find new and unexpected insights that can deliver business-changing results. With these new capabilities, Data Connect helps you cope with the changing industry environment and provides self-service entity analytics.

How does it work?

IBM is including its next-generation entity-matching engine within Data Connect to provide self-service entity analytics. You can also extract data from the MDM repository in a format that Data Connect can easily consume, which allows MDM data to be loaded into Data Connect.

From here, business users can use all the Data Connect functionality provided by the shaper and the canvas to prepare the data for analytics. Using Data Connect’s ability to connect to a growing number of different sources, additional data sets and future open data sets can then be pulled in alongside the master data. Using the MDM-matching capability, these data sources can then be cross-matched and de-duplicated to provide a single set of cleansed entity data that can be used for entity analytics.
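The cross-matching and de-duplication step described above can be illustrated with a minimal Python sketch. It is not the Data Connect or MDM matching engine, whose internals this post does not describe. It substitutes simple string similarity from the standard library's difflib for the real matcher, and the record layout, field names, the `cross_match` and `household_of` helpers and the 0.85 threshold are all hypothetical. The names are borrowed from the banking scenario later in this post.

```python
from difflib import SequenceMatcher

# Hypothetical records: field names and values are illustrative only.
mdm_records = [
    {"id": "mdm-1", "name": "John Smith", "household": "H1"},
    {"id": "mdm-2", "name": "Mary Smith", "household": "H1"},
]
spreadsheet_records = [
    {"id": "xls-1", "name": "John Smyth", "high_net_worth": True},
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cross_match(masters, others, threshold=0.85):
    """Fold `others` into `masters`, merging records whose names are similar.

    Each resulting entity keeps a `lineage` list of the source record ids
    that were collapsed into it. The threshold is an assumed value.
    """
    entities = [{**m, "lineage": [m["id"]]} for m in masters]
    for rec in others:
        best = max(
            entities,
            key=lambda e: similarity(e["name"], rec["name"]),
            default=None,
        )
        if best is not None and similarity(best["name"], rec["name"]) >= threshold:
            # Master values win; the other source only fills in missing fields.
            for field, value in rec.items():
                if field != "id":
                    best.setdefault(field, value)
            best["lineage"].append(rec["id"])
        else:
            entities.append({**rec, "lineage": [rec["id"]]})
    return entities


def household_of(entity, entities):
    """Names of other entities that share the given entity's household."""
    return [
        e["name"]
        for e in entities
        if e is not entity
        and e.get("household") is not None
        and e.get("household") == entity.get("household")
    ]


entities = cross_match(mdm_records, spreadsheet_records)
john = entities[0]
print(john["name"], john.get("high_net_worth"), john["lineage"])
print(household_of(john, entities))
```

In this sketch the spreadsheet row is collapsed into the existing master record rather than creating a new entity, and the `lineage` list records which source rows contributed to it. A production matcher would typically compare several attributes, not just the name.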

Data Connect also includes a set of dashboards that provide an intuitive way to understand the quality of the data, along with data lineage that shows how the engine collapsed the records from the various data sources. The dashboards also show the relationships that exist between the entities and can help discover non-obvious relationships.

With this new capability, IBM is making MDM data open for analytics to everyone. Master data is no longer a silent hero improving data for accurate and efficient business processes. It’s now being utilized to make optimized business decisions throughout the organization. Consider this example:

In this scenario, data from an MDM system is combined with another set of data that happens to be in a spreadsheet file. Suppose you want to run a marketing campaign to target high-net-worth clients to sell them a premium bank account. The MDM system in isolation doesn’t give you the information you need. As a self-service business user, you want to bring these two sources together and determine whether you can identify individuals who can be targeted for the new account.

In the MDM system, John Smith lives with Mary Smith. The spreadsheet file shows that John Smyth (spelled differently) is a high-net-worth client. When you bring these data sets into Data Connect and run the MDM-matching engine, the data is matched and de-duplicated, revealing that John Smith is the same person across the data sets. He’s a high-net-worth client, and he has a wife. The integrated dashboard can be used to understand who else is related to John Smith, and you may decide to target Mary Smith with this premium bank account because she lives with a high-net-worth individual. Entity analytics as part of Data Connect enables you to discover and understand this opportunity. Here’s another example:

In this scenario, you are a risk assessor in an insurance firm, and severe rainfall is predicted within a geographical area that includes a client’s residential location. After pulling up the client data from MDM and the flood warnings issued by the environmental agency, you can use Data Connect to bring this data together and match across the data sets. Using the output of the matching process and analyzing the data with the integrated dashboards, you can identify that a number of properties are at risk. You can then give the client an early warning to help mitigate the risk, and you can increase the flood risk value on the client’s property renewal. You may also notice that an elderly customer is at severe risk; you can therefore notify the emergency services to ensure a proactive resolution to any potential threat.
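The flood scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say how client records are matched to flood warnings, so the sketch assumes a simple match on the first part of a postcode. The record layouts, the `at_risk_clients` helper, the age cut-off of 75 and the reference year are all hypothetical.

```python
# Hypothetical client records pulled from MDM; values are illustrative only.
clients = [
    {"client_id": "c-1", "name": "Client One", "postcode": "AB1 2CD", "birth_year": 1938},
    {"client_id": "c-2", "name": "Client Two", "postcode": "ZZ9 9ZZ", "birth_year": 1980},
    {"client_id": "c-3", "name": "Client Three", "postcode": "AB1 7XY", "birth_year": 1985},
]

# Hypothetical flood warnings, keyed by the first part of the postcode.
flood_warnings = [
    {"area": "AB1", "severity": "severe"},
]


def at_risk_clients(clients, warnings, reference_year, elderly_age=75):
    """Return clients whose postcode area has a warning, flagging elderly ones.

    Matching on the first postcode token is an assumption made for this sketch.
    """
    severity_by_area = {w["area"].upper(): w["severity"] for w in warnings}
    flagged = []
    for client in clients:
        area = client["postcode"].split()[0].upper()
        severity = severity_by_area.get(area)
        if severity is None:
            continue  # no warning covers this client's area
        age = reference_year - client["birth_year"]
        flagged.append(
            {
                "client_id": client["client_id"],
                "severity": severity,
                "priority": age >= elderly_age,  # candidates for proactive contact
            }
        )
    return flagged


results = at_risk_clients(clients, flood_warnings, reference_year=2017)
for row in results:
    print(row)
```

Passing the reference year as a parameter keeps the example deterministic. Clients flagged with `priority` correspond to the elderly customer in the scenario, for whom further action might be taken.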

As these two relatively simple scenarios demonstrate, master data can be put into the hands of business users to unlock potential that can drive better analytics than ever before.

A seamless, self-service experience 

IBM Bluemix Data Connect provides a seamless, integrated self-service experience for data preparation, data integration and entity analytics, helping empower business users to gain insight from data that wasn’t previously available to them. Data Connect now provides a mechanism for organizations to extract further value from their MDM data by ensuring it is used across the organization to provide accurate analytics. In short, Data Connect provides several key features:

  • The ability to run entity analytics on a plethora of data sets, including master data and non-master data sets
  • Cognitive capabilities to allow the MDM-matching technology to be auto-configured and tuned to intelligently match across different data sets
  • Dashboards to provide the ability to understand the lineage and quality of the data that has been matched by the matching engine, understand the relationships among the data and discover new non-obvious relationships within the data that were previously undiscoverable  

The innovations and evolutions covered in this post have allowed IBM to put master data into the hands of business users and provide MDM to everyone. MDM allows them to feed this master data into their analytics for more accurate outcomes. Entity analytics within Data Connect is now available in beta. Go ahead and experience the next evolution of MDM.