Bolstering Big Data Protection for Confident Decisions

Create and enforce security and privacy policies for ironclad protection of big data

Manager of Portfolio Strategy, IBM

There is no turning back. Having been thrust into the era of big data, many organizations are scrambling for enhanced ways to harness its power in real time. But as new big data opportunities emerge, ensuring the veracity and security of information becomes increasingly challenging. New data sources appear every day, and a wide variety of data types are speeding into and out of the enterprise at unprecedented rates. In addition, data volumes are growing exponentially: 2.5 quintillion bytes of data are created every day.1

As a result, standard data protection strategies such as firewalls and identity and access management systems might not be enough or might not fully scale to match the growing data complexity. Because data is in constant motion, data protection must be agile, responsive to threats, and close to the source. At the same time, data protection should also satisfy regulatory requirements. As the number of international and legal mandates continues to rise, one universal challenge for organizations is demonstrating compliance to a third-party auditor.

During a recent Information Integration and Governance forum hosted by IBM, an employee at one large financial institution said that in three months the company would be going live with a new Apache Hadoop-based application. The employee’s biggest fear was that an auditor would shut them down because the organization lacked a secure audit trail that identified who was accessing sensitive data and when it was being accessed.

The Information Integration and Governance forums give organizations an opportunity to meet with senior technical and business leaders at IBM to formulate or refine a governance strategy for their big data initiatives. Data security and privacy concerns are usually top of mind, and organizations can consider some key areas and highly critical elements of a data protection approach in the era of big data.

Integrating security from the ground up

As organizations plan big data projects, they need to recognize that data security cannot simply be added later; it must be built into big data implementations from the start. If there is a data security breach, legal bills and remediation costs can be significant. The Ponemon Institute’s “2013 Cost of Data Breach Study: Global Analysis,” based on 2012 data, found that participating United States companies experienced an average cost of USD5.4 million per data security breach.2 In addition, IBM X-Force reported a nearly 40 percent rise in documented security breach incidents in 2012.3

Big data implementations often involve sharing data and placing all types of data into a central repository. As the analytics are performed, information is then dispersed through reports and documents. If security is not incorporated into the analytics platform, the opportunity to protect data is lost. Data protection policies should be business-driven and available on demand to respond to threats in real time.

During the Information Integration and Governance forums, one question often comes up: How do you know which data needs to be protected when sensitive data is everywhere? It is in production and nonproduction systems, reports and analytics systems, files, documents, applications, and a wide range of other places.

All enterprise data stores and the data inside those systems need to be evaluated (see figure). Some may be deemed a low risk, and other types may be considered a high risk. The first step down the path of big data security is to discover and understand sensitive data across the enterprise. Automated solutions exist to crawl through networks to identify data sources and then examine the data inside. Sensitive data could be business information, personally identifiable information, payment card data, and more.
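The discovery step described above can be illustrated with a small sketch. It is not how any particular IBM product works; it only assumes that discovery boils down to examining field values and tagging those that look sensitive. The pattern set, the `classify` and `scan_records` names, and the sample record layout are all hypothetical, and real discovery tools rely on much richer rules than three regular expressions.

```python
import re

# Hypothetical patterns for illustration only; production tools use far richer rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def classify(value):
    """Return the set of sensitive-data labels detected in one value."""
    found = set()
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(value):
            # Digit runs only count as card numbers if the checksum holds.
            if label == "card_number" and not luhn_valid(match.group()):
                continue
            found.add(label)
    return found


def scan_records(records):
    """Map each field name to the sensitive categories seen in its values."""
    report = {}
    for record in records:
        for field, value in record.items():
            labels = classify(str(value))
            if labels:
                report.setdefault(field, set()).update(labels)
    return report
```

Running `scan_records` over a sample of rows from each data store gives a rough per-field inventory that a cross-functional team could then rate as low or high risk.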


Figure: Help secure data wherever it resides, including test, production, and cloud environments


Taking steps to secure big data

Once the sensitive information is discovered, IBM recommends the following five steps to support big data security requirements:

  1. Protect data as it moves into and out of big data environments. As data is being aggregated, shared, and turned into actionable information and insight, how can an organization be sure its data won’t fall into the wrong hands or be accessed by someone without a valid business purpose? Data masking on demand can de-identify data while keeping its content and meaning. This approach exhibits intelligent data privacy by protecting the data without breaking the analytics.
  2. Integrate security controls into big data platforms. Data security and privacy policies should not be process- or technology-heavy. They should align with business priorities and seamlessly integrate with enterprise systems. For example, automate the classification of data covered by the Payment Card Industry Data Security Standard (PCI DSS), and automatically mask or encrypt that data as it moves across the enterprise. In addition, seamlessly monitor and audit data access as new data sources enter the environment without changes to networks, databases, or applications, and automatically turn off controls when data sources are decommissioned.
  3. Leverage existing technologies to control and protect big data. Don’t disregard best practices. Work to extend data masking, data encryption, and data monitoring and auditing across the enterprise to emerging platforms.
  4. Build consensus about what constitutes sensitive data. No data security and privacy strategy can be successful without agreement from a cross-functional team about what constitutes sensitive data. For example, what information constitutes a corporate secret? Agreement across lines of business, legal staff, security, and IT is important to establish the right security and privacy strategy.
  5. Automatically control access to big data resources, and monitor user behavior. New data sources are added, dropped, and expanded every day. Because the IT environment is always changing, real-time monitoring is required to keep pace. Understand the who, what, when, and how of big data transactions. A complete access history helps organizations understand data and application access patterns, prevent data leakage, enforce data change controls, and respond to suspicious activity. This requires a secure, centralized repository containing a fine-grained audit trail of all data activities across systems. Also build workflows to automate compliance reports on a scheduled basis, distribute them to oversight teams for electronic sign-offs and escalation, and document the results of remediation activities without the need to enable native audit functions.
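As a rough illustration of step 5, the sketch below records the who, what, when, and how of each access in one central structure and answers two simple questions from it. The `AuditEvent` and `AuditTrail` names, the in-memory list, and the read-count threshold are assumptions made for the example. A real audit repository would also need durable, tamper-resistant storage, which this sketch does not attempt.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    who: str        # user or service account
    what: str       # data resource that was touched
    how: str        # operation, for example "read" or "update"
    when: datetime  # UTC timestamp of the access


class AuditTrail:
    """Hypothetical central log; kept in memory purely for illustration."""

    def __init__(self):
        self._events = []

    def record(self, who, what, how, when=None):
        """Append one access event, stamping it with the current UTC time."""
        event = AuditEvent(who, what, how, when or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, what):
        """Return every recorded event for one resource, oldest first."""
        return [e for e in self._events if e.what == what]

    def heavy_readers(self, what, threshold):
        """Return users whose read count on a resource meets the threshold."""
        counts = Counter(
            e.who for e in self._events if e.what == what and e.how == "read"
        )
        return sorted(user for user, n in counts.items() if n >= threshold)
```

A scheduled compliance report could be built on `history`, while a check such as `heavy_readers` is one simple way to surface access patterns that deserve a closer look.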

Applying these five steps helps ensure data-access and change-control capabilities, real-time data monitoring and auditing, and data protection and loss prevention. It also enables vulnerability management and sensitive data discovery, allows classification capabilities to support compliance requirements, and helps prevent data security breaches. Any big data security initiative should aim to achieve the following primary goals:

  • Prevent data breaches. Avoid disclosing or leaking sensitive data.
  • Help ensure data integrity. Prevent unauthorized changes to data, data structures, configuration files, and logs.
  • Minimize the cost of compliance. Automate and centralize controls, and help simplify the audit review process.
  • Protect privacy. Prevent disclosure of sensitive information by masking or de-identifying data on demand across the enterprise in databases, applications, and reports.
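Masking that protects data without breaking the analytics can be sketched as follows. This is one possible approach, not the method of any specific product: a keyed hash (HMAC-SHA256) replaces an identifier with a stable token, so equal inputs still group and join correctly, and a card number is masked down to its last four digits while keeping its layout. The key value, the `tok_` prefix, and the `customer` and `card` field names are hypothetical.

```python
import hashlib
import hmac

# Assumption: a demo key. A real deployment would load it from a secrets manager.
SECRET_KEY = b"demo-key-not-for-production"


def pseudonymize(value, key=SECRET_KEY):
    """Replace a value with a stable token; equal inputs give equal tokens."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def mask_card(number):
    """Mask all but the last four digits, preserving spaces and dashes."""
    total_digits = sum(c.isdigit() for c in number)
    keep_from = total_digits - 4
    seen = 0
    out = []
    for c in number:
        if c.isdigit():
            out.append(c if seen >= keep_from else "*")
            seen += 1
        else:
            out.append(c)
    return "".join(out)


def mask_record(record):
    """Return a copy of the record with identifying fields de-identified."""
    masked = dict(record)
    masked["customer"] = pseudonymize(record["customer"])
    masked["card"] = mask_card(record["card"])
    return masked
```

Because the token is deterministic, a report that counts purchases per customer gives the same totals on masked data as on the original, while the underlying name never appears in it.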

What compliance mandates are you struggling with? What is working in your data protection strategy? Where do you have gaps? Do these five steps help guide your strategy? Please share your thoughts and questions in the comments.

1 Big Data at the Speed of Business, What Is Big Data?,
2 “2013 Cost of Data Breach Study: Global Analysis,” Ponemon Institute research report, benchmark research sponsored by Symantec and independently conducted by Ponemon Institute LLC, May 2013.
3 “IBM X-Force 2012 Trend and Risk Report,” IBM Security Systems, March 2013.


Additional resources