Data masking factories for the big data era

Product Marketing Manager, IBM

2014 ended on a fairly negative note from a data security perspective, with a record number of data breaches tracked and listed in the US. The Sony Pictures Entertainment cyberhack was the last 2014 addition to this list. Even as details about the hackers' motives and the damage they may have caused are still emerging, there are reports that employee personal data, including Social Security numbers as well as medical and salary information, may have been compromised. This incident brings the topic of data security back into sharp focus.

On a more positive note, more organizations are adopting a proactive strategy to protect enterprise data with security as a corporate standard. There is a growing trend among larger companies to cultivate mature and repeatable processes. One such trend is the use of data masking factories. Marc Hebert, chief operating officer at Estuate, has blogged about this growing trend. I had the opportunity to ask him a few questions and enhance my own understanding of the concept. Below is an excerpt from a conversation where Marc shared some of his insights on the data security landscape.

2014 was an important milestone in terms of big data adoption across industries and use cases. What is the immediate impact on data governance and security practices?

Organizations are very focused on data security in their production and non-production databases today. We are seeing an explosion of data masking projects using IBM InfoSphere Optim Data Privacy and many of them are "factory"-type implementations, repeatable processes across the entire corporate application portfolio.

You talk about the “factory” model for masking data in your blog. Can you elaborate on the concept and explain the need to have a factory-like scale and efficiency to manage data?

Many organizations are adopting InfoSphere Optim as a corporate standard for masking sensitive data in all applications. They are using repeatable, "factory"-model processes executed as a centralized service across large numbers of applications, with consistent masking from one application to the next. This model enables them to mask data across a large application portfolio both repeatedly and consistently.
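The key property of consistent masking across applications is that the same sensitive value always produces the same masked value, so masked records in different systems still join correctly. The sketch below illustrates one common way to achieve this, using keyed hashing (HMAC) with format preservation for Social Security numbers. This is an illustrative assumption, not how InfoSphere Optim works internally; `mask_ssn` and `MASKING_KEY` are hypothetical names.

```python
import hmac
import hashlib

# Hypothetical secret key; in practice this would be managed in a key vault
# and shared by every masking job so results stay consistent.
MASKING_KEY = b"example-masking-key"

def mask_ssn(ssn: str) -> str:
    """Deterministically mask a Social Security number.

    The same input always yields the same masked output, so a customer
    masked in the test copy of one application matches the same customer
    masked in another application's test copy.
    """
    digest = hmac.new(MASKING_KEY, ssn.encode(), hashlib.sha256).digest()
    # Map the digest to nine digits, preserving the SSN's NNN-NN-NNNN format.
    digits = str(int.from_bytes(digest[:8], "big") % 10**9).zfill(9)
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"

# The same SSN masks identically wherever it appears.
assert mask_ssn("123-45-6789") == mask_ssn("123-45-6789")
assert mask_ssn("123-45-6789") != mask_ssn("987-65-4321")
```

Because the mapping is keyed rather than a plain hash, an attacker who sees the masked copies cannot enumerate real SSNs without the key, while testers still get referentially consistent data.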

You also talk about deploying data masking in non-production environments. Why is it important to secure the data in non-production environments?

Customers typically make a large number of copies of production databases for ongoing development, test, quality assurance, education and other purposes—it's not unusual for 10 or more copies of production data to exist in a typical IT organization. These copies are often accessible to outside vendors and unauthorized employees (DBAs, business analysts, IT developers), which exposes sensitive data unnecessarily. With so many data breaches in the news today, corporations are tightening the screws on these non-production copies to reduce the risk of a breach.

Do you see any differences in the way organizations across different regions are approaching big data security and governance?

Data security has very high visibility today in the developed countries of North America, Europe and Asia, but somewhat less visibility in the developing world. In the US, large corporations in industries like financial services, retail, healthcare and life sciences are leading the way and adopting data security as a corporate standard. Companies in less regulated industries, and companies that are mid-sized or smaller, are focusing more on point solutions so far.

Closely associated with the emergence of big data are the challenges of storing and managing very large volumes of data. What benefits can organizations expect when they opt for the factory model of application retirement?

The factory model, with its repeatable processes and standard operating procedures, is the state of the art today in large corporations for applying data management practices to large numbers of applications and their data—the big data approach, in effect. By adopting a platform like InfoSphere Optim for application retirement, a corporation can retire a large volume of applications and data in a standard way, into a common corporate archive repository, with standard reporting tools across all applications—and thereby manage large volumes of data proactively to the end of their useful life.

For further reading