Data Security Strategies to Keep the Bad Guys at Bay and the Good Guys Honest

Manager of Portfolio Strategy, IBM

You rely on the data warehouse for business-critical reporting and analysis. The warehouse has become the workhorse engine powering business decisions and driving results. It is a central repository created by integrating data from one or more disparate sources such as applications, databases and legacy systems, a task that is becoming increasingly challenging with 2.5 quintillion bytes of data being created each day. Data warehouses store current as well as historical data and are used to satisfy business queries of all kinds.

Analytics without security is an accident waiting to happen - What’s at stake?

As you evaluate your data warehouse strategy, do not neglect data security, or your reputation could be devastated. With the average cost of security-related incidents in the era of big data estimated to be over USD40 million, you cannot afford to treat data security as anything less than a top requirement. Not only do data breaches cost money, they can also damage an organization’s brand and drive down its stock price, sometimes irreparably. In fact, a survey titled “Business case for data protection” asked, “What are organizations’ goals that are dependent upon good data protection efforts?” In the past few years, preserving customer trust has become one of the top reasons listed. More and more customers expect companies to safeguard their information. The cost of losing customers’ trust could be devastating.

The black market for sensitive data is hot. Currently, cybercrime is a USD388 billion business, USD100 billion more than the black market for the world’s illegal drugs. Identity theft is at a three-year high, affecting 12.6 million US consumers and costing more than USD21 billion each year. Ninety percent of sensitive data theft is from servers. Why? Because servers are where most sensitive data is stored. As the famous bank robber Willie Sutton once said, “Why do I rob banks? Because that is where the money is!”

As you aggregate data into the warehouse, ask these questions:

  • Could some of this data be sensitive?
  • Do you have controls over who or what is accessing this data?
  • Do you know which data could fall under the scrutiny of an auditor?
  • Have any of your competitive peers been hit with a data breach or large audit fine?
  • What is your strategy for managing new risks and threats to the warehouse?

Now is the time to understand sensitive data and establish business-driven protection policies to keep data safe. You will want to continuously monitor and audit data activity as well as de-identify the data, through either masking or encryption, as it moves into or out of the data warehouse. In addition, policies need to keep up with the velocity of data; a delay of even one minute is too late.
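
To make de-identification concrete, here is a minimal Python sketch of two common approaches applied to a record before it is loaded into the warehouse: masking a card number down to its last four digits, and replacing a customer identifier with a keyed hash (pseudonymization). The field names, the `PSEUDONYM_KEY` value and the helper functions are hypothetical and chosen only for illustration. Keyed hashing is used here as a stand-in because it needs only the standard library; reversible encryption would require a vetted cryptography library and proper key management.

```python
import hashlib
import hmac

# Hypothetical secret for illustration only; a real key would come from
# a key management system, never from source code.
PSEUDONYM_KEY = b"replace-with-a-managed-secret"


def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Replace every character except the last `visible` ones with `fill`."""
    if visible <= 0:
        return fill * len(value)
    hidden = max(len(value) - visible, 0)
    return fill * hidden + value[hidden:]


def pseudonymize(value: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Return a keyed SHA-256 digest, so equal inputs map to the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict) -> dict:
    """Return a de-identified copy of the record; the input is not modified."""
    cleaned = dict(record)
    cleaned["card_number"] = mask_keep_last(record["card_number"])
    cleaned["customer_id"] = pseudonymize(record["customer_id"])
    return cleaned
```

Because the same customer identifier always produces the same token, analysts can still join and count by customer without ever seeing the original value.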

Think of your data warehouse as your home, a place to keep your most valued and treasured possessions. Your home is equipped with locks, an alarm system and a safe. Perhaps you are part of a neighborhood crime watch. Would you feel comfortable with anything less? The wide variety of data speeding through your enterprise and moving into and out of the data warehouse requires the same level of protection.

The bottom line: the increasing number of analytics systems storing sensitive data multiplies the risk of a breach. More data stores mean far greater risk.

What needs to be protected in the data warehouse environment?

A comprehensive and agile security strategy is required for the data warehouse environment, which consists of much more than just a database. The entire environment ranges from the extraction of data from operational systems and the transportation of this data to the data warehouse, through distribution to other analytic platforms, to final distribution to the end business user. In today’s highly distributed, complex world, the environment spans multiple servers, applications and systems. When putting a security strategy in place, ask yourself, “Who has a valid business need to know sensitive data?” If no valid reason exists, then access should be denied. Also ask whether sensitive data should be entered into the data warehouse at all. In many cases, you can analyze trends at the aggregate level without exposing the sensitive personal detail that could break PHI/PII rules.
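
As a rough illustration of aggregate-level analysis, the Python sketch below rolls person-level rows up to one summary per region before anything reaches the warehouse. The field names (`name`, `ssn`, `region`, `amount`) and the function `aggregate_by_region` are assumptions made for this example, not part of any particular product.

```python
from collections import defaultdict


def aggregate_by_region(rows):
    """Count rows and sum amounts per region, discarding person-level fields."""
    totals = defaultdict(lambda: {"rows": 0, "total_amount": 0.0})
    for row in rows:
        bucket = totals[row["region"]]
        bucket["rows"] += 1
        bucket["total_amount"] += row["amount"]
    # Only the region key and the two aggregates leave this function;
    # names and identifiers are never copied into the result.
    return dict(totals)
```

The warehouse then receives only the regional totals, so trend analysis remains possible while the personal fields stay in the source system.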

Here are some questions to ponder across three key areas of focus for data warehouse security:

Data inputs

  • As data moves into the warehouse, how can you ensure integrity?
  • Is a data classification policy in place? How is it applied to data entering the data warehouse?
  • Does this data need to live in the data warehouse at this level of detail, or could it be aggregated?
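
One simple way to check integrity as data moves into the warehouse is to fingerprint each batch at the source and again after loading, then compare the two. The Python sketch below is a minimal illustration under the assumption that a batch is a list of JSON-serializable dictionaries; the function names `batch_fingerprint` and `verify_load` are hypothetical.

```python
import hashlib
import json


def batch_fingerprint(rows):
    """Return (row_count, SHA-256 hex digest) for a list of dict rows."""
    digest = hashlib.sha256()
    for row in rows:
        # sort_keys makes the serialization independent of key order.
        line = json.dumps(row, sort_keys=True, separators=(",", ":"))
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return len(rows), digest.hexdigest()


def verify_load(source_rows, loaded_rows):
    """True only if both batches hold the same rows in the same order."""
    return batch_fingerprint(source_rows) == batch_fingerprint(loaded_rows)
```

A mismatch signals that rows were dropped, duplicated, reordered or altered in transit. It does not identify which row changed, so a failed check would normally trigger a more detailed comparison.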

Data outputs

  • Is data exported from the warehouse to other applications, for example for reporting? If so, how is the data secured?
  • How do you know that only authorized recipients are able to obtain the output?
  • How do you know the right recipients receive the right information – and nothing more?

System security

  • Have user access rights been determined and documented?
  • Are they based on roles, for example through Active Directory groups?
  • Are administrator and super-user accounts carefully controlled and audited?
  • Is the supporting database appropriately configured and hardened for maximum security?
  • Is access to data restricted according to its sensitivity?
  • Do you sufficiently monitor and audit the data warehouse?
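
Several of these questions can be tied together in one small sketch: roles carry a clearance, columns carry a sensitivity label, access is denied by default, and every attempt is written to an audit trail. The Python below is an illustrative model only. The role names, sensitivity levels, column labels and in-memory `audit_log` are assumptions for the example; a real deployment would rely on the warehouse platform's own identity, access management and auditing features.

```python
from datetime import datetime, timezone

# Hypothetical sensitivity levels, role clearances and column labels.
SENSITIVITY = {"public": 0, "internal": 1, "restricted": 2}
ROLE_CLEARANCE = {"analyst": "internal", "compliance_officer": "restricted"}
COLUMN_LABELS = {"region": "public", "order_total": "internal", "ssn": "restricted"}

audit_log = []  # stand-in for a tamper-resistant audit store


def can_read(role, column):
    """Allow access only when the role's clearance covers the column's label."""
    clearance = ROLE_CLEARANCE.get(role)
    label = COLUMN_LABELS.get(column)
    if clearance is None or label is None:
        return False  # deny by default: unknown role or unclassified column
    return SENSITIVITY[clearance] >= SENSITIVITY[label]


def read_column(user, role, column):
    """Decide on a request and record the attempt, whether allowed or not."""
    allowed = can_read(role, column)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "column": column,
        "allowed": allowed,
    })
    return allowed
```

Logging denied attempts as well as granted ones matters here, because repeated denials are often the first visible sign of misuse.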

How can you address these issues?


The capabilities outlined in the figure above are available on all leading data warehouse platforms such as the IBM PureData System. Most leading data warehouse platforms come with built-in identity and access management systems.

Photo: BrendelSignature