Discrimination drives the need for ethics in big data
Big data and analytics are profoundly affecting the world around us. One of the focal points of my postings has been how big data and analytics affect our personal privacy in particular. An old and perhaps all too familiar twist on this theme has risen to the forefront of discussion: whether big data and analytics will be used to discriminate against the less fortunate (or perhaps even “the one percent”).
Over the past year, the White House’s Office of Science & Technology Policy (OSTP) and the Federal Trade Commission (FTC) have held public debates on big data and privacy in response to rising discord around the subject of discrimination. By way of background, many of our current data use policies come from the Fair Credit Reporting Act (FCRA), first enacted in 1970 and amended several times since, including in the 1990s. It outlines how data can be used in specific circumstances without any inherent biases.
The use of data to discriminate against individuals (and the organizations they belong to) based upon race, gender, religious affiliation, sexual orientation, location and more has a long-standing history in the US, going back to the 19th and 20th centuries, when census and other data were used for illegal purposes by both the government itself and many commercial actors. Once public outcry arose, regulatory authorities placed restrictions on these use cases and the restrictions were written into law.
These anti-discrimination controls were, for the most part, effective until we entered “the digital age” (for example, the digitization of paper records), when basic data about consumers was generated, persisted and widely disseminated across both government and commerce. Once again, abuses began to surface, and new controls were enacted from the 70s through the 90s (the Fair Credit Reporting Act and the Privacy Act), but these were very narrowly defined and limited in extensibility.
Fast forward to the present, and we find ourselves in the era of big data and analytics. Richly detailed information about every aspect of our lives and every activity we undertake is streamed, persisted, distributed and analyzed in near real time, within an environment that shuns any form of control as either a threat to national security or an inhibitor of innovation. This has alarmed many who lived through the abuses of the past and who fear even more in the future. As a result, the White House OSTP and the FTC have set up workshops and discussions to get ahead of this challenge, and of a possible new wave of public outcry, before a regulatory backlash limits the benefits of big data and analytics.
Based on my earlier postings on the subjects of trust, data brokers and social experimentation, it should come as no surprise that I see discrimination as yet another major challenge to the unabated growth of the big data market opportunity. We know that the users (and brokers) of all this rich data and its analysis have a strong desire to use them to provide much more insightful (and accurate) answers to many risk- and opportunity-related questions. That is only natural given the limitations of the past and the high level of inaccuracy that many data-enabled decisions suffered from. However, we must assume that those with more nefarious aims will be equally zealous in their use, and potential abuse, of the power of big data and analytics. This is the primary motivation driving everyone involved to create a “do no harm” doctrine around these capabilities (for example, a big data code of ethics).
To this end, I would suggest creating a code of ethics for secondary data use, based on a meaningful and detailed set of acceptable vs. unacceptable use cases and monitored by a board of “arm’s-length arbiters.” Its members would be drawn from across government, academia, commerce and human rights organizations, all of whom have a vested interest in a harmonized outcome. To minimize bureaucracy, a system of guidelines and boundaries could be developed, along with communications, enforcement and remediation activities. This body, however, would not be responsible for privacy, as privacy must be written into formal policy and enforced accordingly.
There are no simple fixes to either the privacy or consumer rights issues facing the big data and analytics community at large, but it is clear to me that simply tweaking or rewriting old laws, regulations and policies will not suffice. We need a holistic approach that can serve the needs of all parties regardless of location.
For more information, check out the Freedom of Information Act and Privacy Act Facts