Prepare and maintain third-party data in hybrid environments

Portfolio Marketing Manager, Information Integration & Governance, IBM

Trusting third-party data is one of the biggest struggles IT professionals face when dealing with hybrid data environments. Even if only a small percentage of your information comes from outside sources, it's vital to transform it so it matches the quality of in-house data and can be accepted as a reliable asset across its lifecycle.

For data to be considered of high quality and serve your business effectively, it must be complete, accurate and readily available. Having regular access to data that embodies these three qualities can dramatically reduce your cost of operations by increasing efficiency, lowering the risks associated with bad data and boosting employee and customer confidence.

Preparing data

Accessing data in the cloud may be more trouble than it's worth if that data is so untrustworthy that IT and business professionals have to repeatedly check it for accuracy. Erroneous third-party data also calls into question the quality of your in-house information, no matter how sound it may be.

So it's critical to prepare all third-party data through data cleansing, which eliminates duplicate and erroneous information. To build trust in the data, you should also establish a level of transparency so your data stewards can easily view the history of the information, including how it was acquired and manipulated.
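As a rough illustration of cleansing with a visible history, the sketch below drops incomplete and duplicate records and logs what happened to each row. It is a minimal example under stated assumptions, not how any particular product works: the `cleanse` function, the `email` and `name` fields, and the `vendor_feed` source label are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CleansingResult:
    clean: list = field(default_factory=list)    # records that passed cleansing
    lineage: list = field(default_factory=list)  # one history entry per input row


def cleanse(records, source, key="email", required=("email", "name")):
    """Drop incomplete and duplicate records, logging the outcome for each row.

    The field names and the duplicate key are hypothetical choices.
    """
    result = CleansingResult()
    seen = set()
    for index, record in enumerate(records):
        # A record is erroneous here if any required field is empty or absent.
        missing = [f for f in required if not str(record.get(f) or "").strip()]
        if missing:
            action = "rejected: missing " + ", ".join(missing)
        else:
            # Normalize the key so "A@x.com " and "a@x.com" count as duplicates.
            normalized = str(record[key]).strip().lower()
            if normalized in seen:
                action = "rejected: duplicate " + key
            else:
                seen.add(normalized)
                result.clean.append({**record, key: normalized})
                action = "accepted"
        # The lineage entry lets a data steward see where a row came from
        # and what was done to it.
        result.lineage.append({
            "source": source,
            "row": index,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })
    return result
```

Calling `cleanse(rows, "vendor_feed")` returns both the cleaned records and a per-row history, so a steward can trace why any given row was kept or rejected.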

Another important aspect of preparing your data involves setting up a system that allows your data stewards to test the data on a regular basis. With access to a dashboard that has intuitive testing capabilities, your data stewards can move faster to identify any pressing issues or threats.

Maintaining data

A popular vehicle for maintaining data is a master data management (MDM) system. While these systems traditionally focused only on structured internal data, today's hybrid data environments demand that MDM systems incorporate a range of data sources, including:

  • Internal data: This consists of behind-the-firewall data.
  • External, trusted data: While you can't change this data, you can properly format it and analyze it for accuracy. Dun & Bradstreet data is an example of external, trusted data.
  • External, untrusted data: This is lower-quality data that still has some value. Information from social media platforms is one good example.

You need an MDM system that can include all three of these data sources because each has the potential to yield information that's vital for staying competitive. For example, analyzing data pulled from social media outlets might reveal emerging trends among your younger customers that would've otherwise taken much longer to detect through traditional methods.

Monitoring data

Prepping and maintaining data will do little good unless you're actively monitoring it over its lifecycle to ensure that it remains accurate and trustworthy. Your data stewards should have the ability to quickly assess the health of your data and resolve any data issues before they lead to widespread, costly problems.

To keep your data stewards agile and responsive, equip them with automated workflows and a monitoring system that provides pre-built yet customizable data rules that work in batch and real-time streaming scenarios.
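One way to picture data rules that serve both batch and streaming scenarios is to define each rule once, per record, and reuse it in both modes. The sketch below is only an illustration under stated assumptions: the rule names, the `email` and `country` fields, and the three helper functions are hypothetical, not part of any IBM product.

```python
from typing import Callable, Iterable, Iterator

Rule = Callable[[dict], bool]

# Hypothetical starter rules; a steward could add, remove, or replace entries.
RULES: dict[str, Rule] = {
    "email_has_at_sign": lambda r: "@" in str(r.get("email", "")),
    "country_is_two_letters": lambda r: len(str(r.get("country", ""))) == 2,
}


def check_record(record: dict, rules: dict[str, Rule] = RULES) -> list[str]:
    """Return the names of the rules this record fails."""
    return [name for name, rule in rules.items() if not rule(record)]


def check_batch(records: Iterable[dict], rules: dict[str, Rule] = RULES) -> dict[str, int]:
    """Batch mode: count failures per rule across a full data set."""
    counts = {name: 0 for name in rules}
    for record in records:
        for name in check_record(record, rules):
            counts[name] += 1
    return counts


def check_stream(records: Iterable[dict], rules: dict[str, Rule] = RULES) -> Iterator[tuple]:
    """Streaming mode: yield (record, failed_rules) as each record arrives."""
    for record in records:
        failures = check_record(record, rules)
        if failures:
            yield record, failures
```

Because both modes call the same `check_record`, a rule customized once behaves identically whether it runs over a nightly load or a live feed.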

Securing the proper data solution

Download our e-book to learn more about preparing, maintaining and monitoring your data. And for guidance on data integration and governance, we invite you to explore this practical IBM Analytics resource.

Discover how the following suite of data solutions can improve your organization's data quality by connecting with an IBM expert today.

  • IBM InfoSphere Information Server for Data Quality
  • IBM InfoSphere Master Data Management (InfoSphere MDM)
  • IBM Stewardship Center
  • Data Quality Exception Console