No security? No data? Big problem!

Director, Product Marketing, InfoSphere, IBM

Protecting and securing sensitive big data is necessary to ensure that data is shared for new forms of analysis. Before the owners of that data will share it (yes, political silos still exist, and yes, individuals still feel they own data and can say no to sharing it), they want to ensure it is adequately protected, especially if they are the ones in the cross-hairs if that data is misused.

At the recent Data Governance Financial Services Conference in New York, I spoke on the issue of confidence in big data. And boy, did that topic ever resonate with the audience. I spoke with a chief data officer who said confidence was really the main issue she deals with: governance is all about confidently ensuring that her business users trust and protect their information. A head of governance approached me to discuss confidence in customer data; the team was struggling to accurately identify customers and households as the basis for big data analytics. A lot of common themes came out of my discussions: customer data and big data, rapid integration of new data and business user self-service, and how to visually display data confidence to business audiences. But one issue dominated the conversations: privacy and security.

Ensuring privacy and security for big data, or any data for that matter, is always a top concern. Why? Well, someone might go to jail if sensitive data is exposed. Or face compliance fines. That’s always a compelling reason to act. But I heard something different at this conference. One chief data officer described it this way – “Imagine you want to buy a new car and safety and security is your top concern. 10 years ago you could always decide to add a security device or alarm after you buy the car. But now, you want a system integrated with the ignition. And for safety you want front and side curtain airbags – you’re never going to install those after the fact. So the issue becomes a non-starter – you’ll only buy a car with the features already integrated. The same thing is happening at our firm. Security is a pre-requisite for big data. If we can ensure data security for sensitive information, that project will be approved over one that lacks security. It’s a non-starter for big data and analytics – no security, no data.”

That certainly makes sense. Data security is as fundamental to sharing big data for new analysis as policing is to a healthy and thriving society and economy; it’s a pre-requisite. And it offers an interesting twist on the reason to worry about privacy and security. If you want to share big data freely and combine it in new and interesting ways in technologies such as Hadoop or NoSQL, then you need to ensure it is protected. Big data is very often sensitive data: it’s important information about your customers, your products, your suppliers. That data must be masked when it’s appropriate to do so (good rule of thumb: if the actual data value isn’t relevant for the analysis, mask it). It must also be monitored to ensure that internal users aren’t accessing it inappropriately.
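To make the masking rule of thumb concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular product does it: the field names, the `MASKING_KEY` value, and the helpers `mask_value` and `mask_record` are all hypothetical, and keyed hashing (HMAC) is just one of several possible masking techniques. The idea is simply that any field whose actual value the analysis does not need gets replaced with a consistent token before the data is shared.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real deployment would fetch
# this from a key management system rather than hard-coding it.
MASKING_KEY = b"example-key-not-for-production"


def mask_value(value: str, key: bytes = MASKING_KEY) -> str:
    """Return a deterministic token for a sensitive value.

    The same input always yields the same token, so masked fields can
    still be joined or counted, but the original value is not exposed.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened token, purely for readability


def mask_record(record: dict, needed_fields: set, key: bytes = MASKING_KEY) -> dict:
    """Mask every field whose actual value the analysis does not need."""
    return {
        field: value if field in needed_fields else mask_value(str(value), key)
        for field, value in record.items()
    }


# Hypothetical customer record: the analysis only needs zip and balance.
customer = {"name": "Jane Doe", "ssn": "123-45-6789", "zip": "10001", "balance": 2500}
masked = mask_record(customer, needed_fields={"zip", "balance"})
print(masked)
```

In this sketch the analyst still sees the postal code and balance, while the name and social security number arrive as tokens. Because the tokens are deterministic, two records for the same customer still match each other, which matters if the masked data will be combined with other sources.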

Before embarking on a new big data and analytics project, make sure you’ve taken care of the fundamentals. Make sure you can adequately protect and secure sensitive data before you ask a data owner to share it.

For tips on how to protect and secure big data, check out the ebook Top Tips for Securing Big Data Environments.