Adapting information integration and governance for the era of big data
I participated in a really rich Twitter chat yesterday on the topic of "Evolving integration and governance for big data" with thought leaders Jim Harris (@ocdqblog), James Kobielus (@jameskobielus), Tim Crawford (@tcrawford) and Richard Lee (@InfoMgmtExec).
Sensitivity and error
The first topic we discussed was whether there will be a backlash to high-profile mistakes with big data and analytics, such as the recent OfficeMax incident, in which the company sent a marketing mailing with “Mike Seay – Daughter Killed in Car Crash” visible in the envelope window. Everyone agreed there would be some backlash, both from mistakes such as these and from consumer reaction to the big brother feeling that some big data campaigns evoke (maybe that company knows a little too much about me). Potential new regulations will force companies to answer some fundamental questions, such as where their data originated and what its intended use is, and will hold them accountable for the answers.
Given the likely consumer backlash and increase in regulation, organizations must be agile in their approach to big data privacy and security. All Twitter chat participants were in accord that this should in no way slow down the adoption of big data. In fact, the ability to handle sensitive big data carefully can become a significant differentiator for an organization.
The second interesting discussion topic focused on the issue of finding data and establishing a level of confidence in that data. Participants noted that it simply takes too long to gather data on each big data project. Some suggested that between 40 and 80 percent of a project's time was consumed by the single task of aggregating the data. Integration technology could help reduce that number dramatically, with the aid of automated discovery and classification and self-service data integration capabilities.
The issue of confidence brought a host of opinions. Some felt that confidence was low, or always in question by business users, while others felt that confidence was high until proven otherwise, or at least high in reports that users have used previously (there’s confidence in comfort). The group concluded that confidence, while subjective and difficult to quantify, is very important to the adoption of big data and analytics. If users lack confidence in data, they will lack confidence in the results.
Big data and analytics evolution and adaptation
Another topic that sparked debate was how to address all of these issues: can existing integration and governance technology evolve and adapt to new requirements, or does it need to be reimagined and reinvented? Everyone agreed that evolution was the desirable and logical path. Existing integration and governance technologies should adapt to big data scale and adopt a wider variety of data types. They should also evolve to include new big data technologies to address those broader requirements. The conclusion was clear: there is no need to reinvent the wheel when the core products are fundamentally sound and built for big data. They simply need to evolve to meet these new requirements.
These Twitter chats are hosted weekly under the hashtag #BigDataMgmt and I encourage you all to join the discussion. Also check out the latest blogs, videos, and infographics on information integration & governance at ibmbigdatahub.com.