Big Data Philosophies: More Practical than You Might Think

Big Data Evangelist, IBM

People often treat philosophy as an intellectual pastime for the unemployable. That’s absolutely not the case.

Philosophy is the most practical of disciplines. Essentially, it is an examination of basic principles, cultivating minds that can critically examine problems down to their very marrow. Philosophical inquiry is what any good analyst, engineer, developer or business person should do if they want to put their thinking and practices on solid conceptual footing. It enables you to better explain, defend and evolve your approach, both in your own mind and in the minds of others you hope to win over.

I recently came across this article on OCDQ Blog in which Jim Harris called for “data philosophers.” I’m not sure that the world needs a new job category by this name, insofar as philosophy (aka critical thinking) should be a basic aptitude of all data professionals. Nevertheless, it would be useful to see what relevance the main branches of academic philosophy might have to the new world of big data. There’s more practical relevance than you might think.

Full disclosure: I was essentially a philosophy minor in college and took a wide range of courses. Here is my first take on the various philosophies relevant to big data (and relevant to all the things you can do with big data):

  • Philosophy of science: This is the crux of Harris’ discussion. True science is simply the human activity of building and testing interpretive frameworks through controlled observation of real-world phenomena. Essentially, science reveals a single, authoritative, empirically grounded version of the truth. And a data scientist—a key developer in big data—is simply a scientist whose focus is on building and testing statistical models of real-world phenomena, drawn from observational data.
  • Epistemology: This is often known as the “theory of knowledge,” concerning how we can truly know anything with certainty. Philosophers have long distinguished between “phenomenology” (knowing things as they appear) and “ontology” (knowing things as they are; not “ontology” in the semantic Web sense). In data management, a “single version of the truth” is not ontological but phenomenological: a portrait of things as they appear (to a consensus of human stakeholders, as processed through their information systems). Big data repositories simply produce richer portraits of observed phenomena, but can’t put us directly in touch with the most fundamental truths.
  • Aesthetics: This is the theory of beauty. In data management, beauty comes into the picture in several ways, quite often masquerading as simplicity, economy, efficiency and effectiveness. We gravitate toward Ockham’s Razor or the principles for normalizing relational databases because they are conceptually elegant. Data visualizations attract our eyes because they may be lovely to behold, irrespective of the story they tell. Data-driven narratives draw their power in part from the storytelling skills of the analyst who wields them. And even when the big data is an unholy mess, it can, if analyzed by sharp minds with precision tools, reveal deeper beauties in the world around us.
  • Logic: This is the study of modes of reasoning. Critical thinking (deductive, inductive, case-based, etc.) is the heart of big-data analytics. Whether you’re a spreadsheet jockey or the world’s foremost data scientist, you’re looking for correlations, causations and other patterns that come down to logic. If the patterns you’ve uncovered don’t fit into any logical narrative of what might be going on, it’s very likely that you’re seeing spurious correlations. Big data is the raw material that we drill and vet for patterns that are not only statistically significant but logically plausible.
  • Ethics: This is an investigation into normative principles that guide human decisions. Of course, the core application of most business analytics is decision support. Many data applications focus on framing the normative “should/shouldn’t” choices that shape human action. These applications go by names such as performance management, prescriptive analytics, next best action, recommendation engines and the like, all of which allude to decision support, which is what business intelligence and big data are all about. The ethical framework for decision making often comes down to a wide range of situational factors that include, but are not limited to, the options you present to users in their decision support applications.
  • Philosophy of mind: This is the theory of subjectivity. When data professionals talk about “customer experience management,” they’re making an implicit assumption that psychological states can be measured and optimized as if they were objective constructs. But the mind is innately subjective, private and inaccessible to external manipulation. As you develop the 720-degree view of the customer—external behavior and internal experience—always remember that you’re not truly reading their mind. People have free will and can defy your expectations at every step of their journey. That’s why you can’t truly engage with people unless you connect with them tête-à-tête through simple human conversation.
  • Political philosophy: This is an investigation into principles of human governance, jurisprudence and equitable distribution of society’s resources. The relevance of political principles to big data may seem remote, but the link is there. When used to support rational decision making, big data supports the egoistic “homo economicus” philosophy that’s been driving free-market economics since Adam Smith. People seldom realize that Smith considered himself a philosopher (he called his branch “political economy”). And when used to drive targeted marketing offers by customer segments, big data supports a political-economic principle called “Pareto efficiency.” In the offer-targeting context, the chief caveat on Pareto efficiency is that you must assume there’s no cross-segment envy (i.e., customers aren’t offended if they don’t get as good a deal as “the next guy”). Data-driven marketing professionals usually assume a customer experience that’s absolutely devoid of envy. That’s a philosophical assumption that may significantly distort what actual customers feel in their heart of hearts.

I doubt that Plato, Kant and Bertrand Russell lost much sleep dissecting these issues. Keep in mind that philosophy is a living human activity. It will languish inside dusty old books unless we adapt and apply it to contemporary problems.

Besides, none of those dudes had a whole lot to say about big data.

Related resources

You might also like this fanciful rendition of James' post on Debunking Myths about Data Scientists


Image: By Matt Neale from UK (Greek philosophers, uploaded by NotFromUtrecht) [CC-BY-2.0], via Wikimedia Commons