Must You Sacrifice Privacy for Big Data?

Product Director, Master Data Governance

Do we sacrifice privacy for the promise of big data? Finding a proper trade-off between the benefits of big data and requirements of privacy regulations is becoming a critical area of information governance.

Holistic analysis of multiple sources of information for decision making is extremely promising for businesses and government. However, such comprehensive use of information potentially conflicts with the information privacy of customers, citizens, patients, guests, etc., depending on the industry under consideration.

In this blog post we will briefly discuss the evolution of technology in the context of growing concerns over information privacy. This will help us better understand the history and draw important conclusions about good governance practices for big data.

In the closing decades of the nineteenth century, multiple technological inventions changed how information was captured, stored and distributed to the public. Fairly inexpensive portable cameras and sound recording technologies became available. Photography became a routine method of presenting information in newspapers and magazines. Bell invented the telephone, and a telephone exchange opened in 1877. Radio was invented in the 1890s.

In 1890 Boston lawyers Samuel Warren and Louis Brandeis published an article, "The Right to Privacy," known as the first publication in American history on the topic. The article recognized that advances in technology left individuals vulnerable to the uncontrolled dissemination of their personal information, words, images and actions without their consent. At that time privacy rights were primarily a concern for celebrities and the upper class.

With computers commercially available in the early 1970s, privacy issues became important for the general population. The US Privacy Act of 1974 addressed the new data volumes and methods of information storage, maintenance, processing, retrieval and distribution. The Privacy Act of 1974 differentiates between the terms "system of records" and "statistical record." The former stands for a group of records that are stored and retrieved by the name of the individual or by some identifier assigned to the individual. A "statistical record," by contrast, can be used for research, reporting or training purposes only and cannot be used in whole or in part to make a determination about an identifiable individual. To convert "system of records" data into a "statistical record," organizations often anonymize the data by removing personal identifiers such as name, address, Social Security number, phone number, etc.
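The identifier-removal step described above can be sketched in a few lines of Python. This is a minimal illustration only: the field names and the sample record are hypothetical, and the Privacy Act itself does not prescribe a specific list of fields to remove.

```python
# Hypothetical set of direct identifiers; a real list depends on the
# organization's data and the regulations that apply to it.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "phone"}


def strip_identifiers(record):
    """Return a copy of the record without the direct identifier fields."""
    return {key: value for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS}


# Made-up "system of records" entry, for illustration only.
system_of_records = [
    {
        "name": "Alice Example",
        "address": "1 Main St",
        "ssn": "000-00-0000",
        "phone": "555-0100",
        "zip": "02139",
        "birth_year": 1970,
        "diagnosis": "asthma",
    },
]

# The "statistical" view keeps only the non-identifying attributes.
statistical_records = [strip_identifiers(r) for r in system_of_records]
```

Note that attributes such as ZIP code and birth year survive this step. They are not direct identifiers, but they can still narrow down who a record belongs to when combined with other sources.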

In the last two decades, with the advances of the Internet and social media, tremendous amounts of information about individuals have become available on the net. It is a paramount challenge to intelligently match and join all of this information and connect the dots. This is exactly what big data promises. Nowadays the notion of privacy, more than at any time in the past, must be applied to the whole collection of sources that big data solutions can federate, mine, match and relate in order to de-anonymize the data and reconstruct sensitive personal information from a "statistical record."
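To make the re-identification risk concrete, here is a minimal Python sketch of one simple matching approach: joining records that carry no names against a public list on shared attributes. All field names, the choice of matching attributes and the sample data are assumptions made for illustration; real matching engines use far more sophisticated, probabilistic techniques.

```python
from collections import defaultdict

# Hypothetical attributes that appear in both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def reidentify(deidentified, public):
    """Link nameless records to names when the shared attributes match
    exactly one person in the public list."""
    index = defaultdict(list)
    for person in public:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index[key].append(person["name"])

    matches = []
    for record in deidentified:
        key = tuple(record[q] for q in QUASI_IDENTIFIERS)
        candidates = index.get(key, [])
        if len(candidates) == 1:  # unambiguous match only
            matches.append((candidates[0], record))
    return matches


# Made-up records with direct identifiers already removed.
deidentified = [
    {"zip": "02139", "birth_year": 1970, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1985, "gender": "M", "diagnosis": "diabetes"},
]

# Made-up public list, such as a directory or social media profiles.
public = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1970, "gender": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1985, "gender": "M"},
    {"name": "Carl Example", "zip": "02139", "birth_year": 1985, "gender": "M"},
]

relinked = reidentify(deidentified, public)
```

In this toy data set the first record is linked back to a single name, while the second stays ambiguous because two people share the same attributes. The more sources that are joined, the fewer records remain ambiguous.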

The public can't even imagine what information about their personal lives can be gathered on the net if the power of big data and modern matching algorithms is unleashed. Hence the public should be better educated and more vigilant about what information about them, their households, families and friends is available in the public domain.

Legislators should review terms such as "system of records" and "statistical record" and define them more precisely to address the realities of big data and the power of mining, matching, and entity and relationship resolution techniques. A recent development is SNOPA (the Social Networking Online Protection Act), which would prohibit institutions from requiring users to give up their log-in credentials on social media sites. SNOPA just scratches the surface of what is required to bring privacy legislation in line with the big data age. Our society is on the brink of a new generation of privacy legislation.

Corporations implementing big data solutions should be smart enough to understand the privacy issues and the risks of non-compliance, and to embrace the concept of "privacy by design": a systematic approach that takes privacy into account from the start.

Information governance as a discipline is quickly evolving to address privacy needs and risks in the age of big data and should be an essential part of organizations’ big data strategies. 
