Openness is where the world is headed. It’s the core principle in truly agile governance of a dynamic, complex, smarter planet.
Openness means many things to many people. From a big-data perspective, we can break down the key dimensions of openness as follows:
- Open platforms and ecosystems: Open-source initiatives are transforming all big-data platforms. Some vendors have embraced the movement toward open hardware platform designs, but they’re still a fringe minority. In software markets, open source has taken root pervasively. As the enterprise Hadoop market continues to mature and many companies deploy their clusters for the most demanding analytical challenges, data scientists will begin to migrate toward this new, open source-centric platform. Similarly, OpenStack and Cloud Foundry are playing an equally pervasive role in opening up enterprise cloud platforms. In the process, these and other open platforms are fostering open ecosystems of cloud service providers and independent software vendors in vibrant new marketplaces to serve diverse niches.
- Open languages, tools and APIs: Enterprise adoption of the open-source MapReduce, Pig and R analytic-modeling languages will continue to grow. Likewise, the industry will surely converge around a SQL superset that supports agile queries that span the full range of established databases and the newer big-data platforms. Open languages, tools and APIs for all big-data requirements are emerging. The adoption of open standards—both those that spring from open-source communities and those catalyzed through industry standards groups—will accelerate as enterprises demand a common interoperability framework to unify disparate investments.
- Open expertise: Just as platforms, languages and tools are opening up, so is big data’s development ecosystem. Big data will leverage the most open arena of all, “crowdsourcing” cloud approaches such as Kaggle and TopCoder, to pool the world’s expertise (or at least that of all the smart people in your company and/or value chain) in wide-ranging development, investigation and exploration of analytics- and data-infused business problems from all conceivable angles.
- Open data: This is the most radical form of openness and, in some circles, the scariest. It seems to conflict with intellectual property rights in information, and hence with a core monetizable asset of knowledge workers everywhere. But the push, both in the global culture and even among many in the public and private sectors, is to loosen controls on access to, use of and republishing of data, free from copyright and other legal and technical restraints. It’s not just a libertarian preoccupation but also a focus of recent government initiatives such as Data.gov, Data.gov.uk, and the EU’s DOPA initiative; the latter is discussed in this recent article. What’s most interesting about DOPA is its aim: to facilitate open-data semantic transparency, within so-called “data supply chains,” by spurring commercial development of technologies for automated dataset detection, curation, entity linkage and advanced visualization.
In addition to these layers of technical openness, we should also address the term’s ideological connotations. These revolve around the principle of transparent oversight of public and private institutions in a democratic society. In a recent Stanford Law Review article, Neil M. Richards & Jonathan H. King raise a concern about big data that they term the “transparency paradox”:
“Big data promises to use this data to make the world more transparent, but its collection is invisible, and its tools and techniques are opaque, shrouded by layers of physical, legal, and technical privacy by design....[W]hen big data analytics are increasingly being used to make decisions about individual people, those people have a right to know on what basis those decisions are made. Danielle Citron’s call for “Technological Due Process” is particularly important in the big data context, and it should apply to both government and corporate decisions.”
You don’t need to read too deeply into recent international news stories to realize that many people share these concerns. They echo long-running popular unease with a “technocratic” philosophy that some have dubbed the “dictatorship of data” (which I discussed in this recent blog post).
Clearly, we need to be vigilant lest big data become an obstacle—deliberate or inadvertent—to achieving greater openness, transparency and agility in all spheres of human civilization.
Perhaps we should add “open governance” to that list of dimensions. What do you think?