What's keeping data science from playing a more central role in public policy?

Big Data Evangelist, IBM

Long ago in a faraway galaxy, I had a college internship at an urban coalition. My primary task in the summer of 1979 was to estimate how much money the City of Detroit stood to lose in foregone federal entitlement grants if its citizenry were undercounted in the 1980 US census.

My immediate boss at that coalition didn't really care how I arrived at my estimate, as long as it was defensible and was packaged for easy consumption by Detroit's congresspeople in the US House of Representatives. Long story short: I reported an estimate of $50 million, which was picked up immediately by Representative John Conyers (still serving, 35 years later), who blurted it out in an important committee hearing on the Hill, which bent the ears of some reporters, who reported it in the local news the next day. And that's what I did on my summer vacation!

I didn't use anything even remotely resembling data science to produce my estimate, just extensive data gathering, dogged vetting and simple arithmetic. But if I had in fact used fancy statistical analysis and predictive modeling to produce the exact same deliverable, it wouldn't have mattered one bit to the elected officials trying to make a specific decision, such as whether to include specific guidance on the matter to the Census Bureau in a forthcoming appropriation bill. You actually don't want to see close-up how our laws are made. And if you're a data scientist who expects to wow public policymakers with your sophisticated tools and techniques, you're in for a rude awakening.

It's no big secret that gut feel oils (or gums up) the decision-making wheels of government at every level. To the extent that data-scientific methods have input into the policymaking process, it's mostly aspirational, as I discussed in this recent post.

You can be cynical about this or realistic: data science, by itself, is an ineffectual governance tool if it lacks strong champions who can wield it to get things done in the legislative, executive and judicial branches. Decision science is just as important as data science: being able to identify the myriad factors that drive policymakers, and to use this understanding to identify where data-driven methods might have some potential sway. Political scientists, one species of decision scientist, spend their careers dissecting these factors in diverse policy arenas.

What will it take to increase the presence of data science (data-driven, evidence-driven or computational methods) in public policymaking processes? As I stated here, big data analytics can hold sway if it helps frame a compelling case in the minds of decision makers for taking this or that action. And if data science can show that a counterintuitive scenario is more valid than common sense or gut feel on a particular decision, it just may change the terms of debate in its favor.

I took special interest in this recent article on how big data has "some big problems" when it comes to influencing public policy. Author Derrick Harris states that "most of these obstacles have little to do with the data itself. It’s easier to gather and easier to analyze than ever before. Rather, the problem is that data scientists and researchers—even those who really care about tackling important issues—can often have a difficult time overcoming the much more powerful forces fighting against them."

Data scientists' chief adversaries in public policy arenas aren't specific individuals, interest groups or political parties, says Harris. Instead, the headwind they face is from impersonal forces, which I would loosely paraphrase as follows:

  • Unfamiliarity. It's easier for politicians to cozy up to the gut-feel devil they know rather than the data science devil they don't. Politicians, like many people, may stigmatize data-centric decision-support practices (such as data mining and social sentiment monitoring) out of sheer unfamiliarity. More to the point, policymakers may react to popular hysteria about otherwise benign data science-centric practices (such as Facebook's real-world experiments in so-called "mood manipulation") in order to cover their rear ends.
  • Indifference. Even if politicians were to get savvy and enthused about data science, their constituents probably wouldn't. This means that the political class will always have to ratchet the data science content of their messages down to those choice "sound bytes" that win elections. As Harris states, "rather than talking about the myriad studies or the new types of data we could gather to find out even more, politicians often fall back on ideological arguments meant to appease voters and campaign contributors."
  • Complexity. Data scientists specialize in painting complex statistical scenarios, which are often hard to crisp up into simple arguments that sway legislators' minds and sell to the voters. Another thing to consider is that complex data-driven scenarios may not provide clear guidance for writing the general rules that become law.

But I'm not entirely down on data science's role in public policy. It may prove useful in spelling out the broad historical trends and likely future scenarios that the laws should address. And it may prove useful in monitoring downstream impacts of the laws and of the administrative actions that implement them.