Why This Big Data Topic Should Be On Your "Favorites" List

Program Director, Analytics Platform Marketing, IBM

When IT and business leaders talk about big data, information governance is not typically at the top of the list of conversation topics. Some practitioners dismiss it out of hand as impractical for enormous quantities of data. Some say governance of big data would cross the line between just-right and too much governance. Some say it sounds like a good idea but don’t know where to start. Whatever the reason, the typical decision is to tackle it later.

Now research by Aberdeen Group (The Big Data Imperative: Why Information Governance Must Be Addressed Now, Aberdeen Group, December 2012) suggests that information governance for big data must be addressed right away. We all know that data is growing; Aberdeen puts the current growth rate at 56% year over year. Aberdeen goes on to say that “companies that are not taking measures to ensure business data is accurate, protected and easily managed at large scale, are losing millions of dollars in avoidable costs.”

On top of all those avoidable costs are avoidable business missteps caused by poor decisions based on unreliable, ungoverned data. In short, the cost of doing nothing about governance of big data is high. So what can be done? Fortunately, tools are readily available to address key elements of information governance, including:

Data Integration – As data volumes, data sources and complexity grow, the importance of tools with years of production experience can’t be overstated. Some have even referred to IBM InfoSphere offerings in this area as “the first big data tools.”

Data Quality – Poor data quality can have a serious impact on any big data project, leading to unreliable results and a lack of confidence in both the data and the analysis.

Master Data Management – This technology is critical to any efforts focused on combining big data with more traditional data to create a 360° view of customers.

Data Lifecycle Management – One aspect particularly important to big data is the identification of data that can be safely moved from production systems to less costly storage media, with easy access still available.

Data Security – Who needs access to personally identifiable information? Who has accessed sensitive information this week? Data protection not only keeps data safe but also helps avoid breaches and the associated costs to reputation and the bottom line.

As an integral part of the IBM Big Data Platform, IBM’s InfoSphere Information Integration and Governance (IIG) capabilities provide the building blocks for trusted information in a big data environment. Building confidence in data and avoiding the high cost of doing nothing are both objectives worth pursuing.