Big Data: A Boon For Governance Professionals

A strong big data governance program can take the guesswork out of finding and using the right information to make business decisions

Founder and Managing Partner, Information Asset, LLC

Several organizations are implementing information governance to oversee critical data about their customers, materials, vendors, and financials. By the same token, enterprises are implementing big data programs that take advantage of open source technologies like Apache Hadoop to derive value from new classes of data from sensors, RFID, social media, call center logs, and other sources. For the most part, information governance and big data programs exist in silos within organizations.

However, I expect organizations to take a more holistic approach to these initiatives over the next 12 to 18 months, and I am coining the term “big data governance” to reflect this emerging trend. I define big data governance as the formulation of policy to optimize, secure, and leverage big data as an enterprise asset by aligning the objectives of multiple functions. I believe that information governance programs are just starting to broaden in scope to include big data. Here is an expanded framework for information governance:

  1. Metadata governance. This includes a consistent set of business definitions for critical terms as well as linkage to the associated technical artifacts.
  2. Master data governance. This includes a single view of customers, materials, vendors, and employees and a chart of accounts. Each data domain has specific attributes that need to be fit for purpose. For example, phone number is an important attribute for the customer data domain because an enterprise needs valid contact information.
  3. Reference data governance. This includes data that is relatively static, such as codes for countries, states or provinces, currencies, industries, and customer segments.
  4. Big data governance. These data tend to be operational in nature and meet the criteria of the three Vs: they stretch the boundaries of volume, velocity, and variety, where variety covers structured, semi-structured, and unstructured formats. Big data spans every function within the enterprise.

“Every employee, be it the security guard or the CEO of the company, must understand that data is a corporate asset and treat it as such,” says Inderpal Bhandari, chief data officer, Medco Health Solutions. Big data governance programs also need to focus on issues that are similar to other information governance initiatives. These programs must address the following:

  • Metadata. Big data governance needs to create sound metadata to avoid situations where, for example, a company buys the same dataset twice because it is named differently within two different repositories.
  • Privacy. Enterprises need to be very specific about how they address privacy concerns, for example when leveraging social media analytics.
  • Data quality. Organizations need to establish what level of data quality is “good enough” because of the high volume and velocity of big data.
  • Information lifecycle management. Big data governance programs need to establish archiving policies to ensure that storage costs do not spiral out of control. In addition, organizations need a retention schedule to facilitate the defensible disposition of data in compliance with legal requirements.
  • Stewardship. Finally, organizations need to recruit big data stewards. For example, the exploration and production groups within oil and gas companies have stewards responsible for seismic data including the associated metadata. These stewards need to avoid situations where the organization pays for external data that it already owns because of inconsistent naming conventions. In addition, the social media steward needs to work with legal counsel and senior management to establish policies relating to the acceptable use of this information.
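The metadata and lifecycle points above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the catalog entries, repository names, and retention periods below are hypothetical, and a real program would draw them from a metadata repository and a legally reviewed retention schedule. The first function flags datasets whose content is identical even though they carry different names in different repositories; the second lists datasets that have outlived their retention period and are candidates for disposition.

```python
import hashlib
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical catalog entries: (repository, dataset name, raw content, acquired date).
CATALOG = [
    ("exploration", "seismic_survey_2011_north", b"trace-001,trace-002", date(2011, 3, 1)),
    ("production", "N_BLOCK_SEISMIC_11", b"trace-001,trace-002", date(2011, 6, 15)),
    ("marketing", "social_mentions_q1", b"post-17,post-42", date(2012, 1, 10)),
]

# Hypothetical retention schedule in days, keyed by repository.
RETENTION_DAYS = {"exploration": 3650, "production": 3650, "marketing": 365}


def fingerprint(content: bytes) -> str:
    """Return a content hash so datasets can be compared regardless of name."""
    return hashlib.sha256(content).hexdigest()


def find_duplicates(catalog):
    """Group catalog entries by content hash and return groups with more than one entry."""
    groups = defaultdict(list)
    for repo, name, content, _acquired in catalog:
        groups[fingerprint(content)].append(f"{repo}/{name}")
    return [names for names in groups.values() if len(names) > 1]


def due_for_disposition(catalog, schedule, today):
    """Return datasets held longer than the retention period of their repository."""
    due = []
    for repo, name, _content, acquired in catalog:
        if today - acquired > timedelta(days=schedule[repo]):
            due.append(f"{repo}/{name}")
    return due


if __name__ == "__main__":
    print(find_duplicates(CATALOG))
    print(due_for_disposition(CATALOG, RETENTION_DAYS, date(2013, 6, 1)))
```

An exact content hash only catches byte-for-byte copies. Datasets that were reformatted or partially updated would need a looser comparison, which is one reason a steward's judgment remains part of the process.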

“Data governance is becoming increasingly important to organizations as they realize the need for the business to take ownership over its vital data assets,” says Andy Hayler, CEO of The Information Difference. “IBM has long been very active in data governance, and this framework usefully extends the scope of governance initiatives beyond master data to the often-neglected (yet important) areas of reference data and corporate metadata, as well as explicitly recognizing the need for data governance to address the issue of big data.” Done correctly, a strong big data governance program can take the guesswork out of finding and using the right data to make business decisions—regardless of where the information comes from, what type it is, or how fast it is moving. This takes pressure off governance professionals as new data sources flow in and puts business leaders in a better position to explore and capitalize on new types of data for growth.
