Data Warehousing: The Brains of the Big Data Operation

Cognitive technology accelerates the understanding of data to meet the demands of the <em>now</em> business

Director of Offering Management, IBM Analytics, IBM

Data warehouses were built over the last two decades because we knew that data could make organizations smarter at running their businesses. Reports were then automated to give stakeholders the information they wanted about the business, such as how sales are doing or which region is outperforming the others.

As business intelligence (BI) became widely used, organizations were soon capable of not just reviewing progress but also making some predictions. If customers buy one thing, for example, they may have a high propensity to buy a companion product, so why not run an offer? Or, if customers experience less-than-satisfactory service, there is a high likelihood of attrition, so why not make a retention offer?
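
The companion-product idea can be made concrete with two standard measures: confidence (how often buyers of one product also buy the other) and lift (how much more often than chance). The following is a minimal sketch, assuming purchases arrive as simple sets of product names; the function name `rule_stats` and the sample baskets are hypothetical and used only for illustration.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (confidence, lift) for the rule 'antecedent -> consequent'.

    baskets is a list of sets, one set of product names per purchase.
    """
    total = len(baskets)
    with_antecedent = [b for b in baskets if antecedent in b]
    with_consequent = sum(1 for b in baskets if consequent in b)
    if not with_antecedent or with_consequent == 0:
        raise ValueError("both products must appear in at least one basket")

    both = sum(1 for b in with_antecedent if consequent in b)
    # Confidence: share of antecedent buyers who also bought the consequent.
    confidence = both / len(with_antecedent)
    # Lift: confidence relative to the consequent's overall purchase rate.
    lift = confidence / (with_consequent / total)
    return confidence, lift


# Hypothetical purchase history, for illustration only.
baskets = [
    {"phone", "case"},
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone"},
    {"charger"},
    {"case"},
    {"charger"},
    {"cable"},
]

confidence, lift = rule_stats(baskets, "phone", "case")
print(f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests that the companion product is bought more often alongside the first product than on its own, which is the kind of signal that might justify running an offer.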

The challenge for organizations shifted from the technology needed to accomplish these tasks to their ability to consume the results operationally and their willingness to trust the data. The goals were simple: let the data speak to you, and leverage a single source of truth.

A new natural resource emerges

Fast-forward 15 years. The very point of view we fought so hard for is now rolling off everyone’s tongue. Data is the new natural resource—everything is big data. But now if the data speaks to you, all those voices may be overpowering because most of the data in the world today was collected in just the last few years.

In today’s highly instrumented world, people love data. They can’t live without it, which ushers in the next great hurdle: timely relevance. The issue today is not the ability to process an abundance of data; after all, the technology does exist. Instead, the next big challenge is helping client organizations find relevance in the tsunami of data when it’s needed.

Think about it. The problem is not a data problem; it’s an understanding problem. Big data makes this problem worse, not better: it provides more data to add depth, but it also adds the burden of working out what is relevant.

Some of my colleagues refer to what is relevant as real-time business, but the task is so much bigger than that. Yes, everyone wants their data in an instant, and business clients absolutely demand it, but the real value comes from accelerating true understanding and insight at the time of engagement. As the ocean of data grows ever deeper, how do we provide a way through all the noise?

Cognitive computing takes a bow

Suppose you go to the doctor with some influenza-like symptoms. Your doctor has a library of medical journals and information available 24 hours a day, but probably has only three to five hours a week to read them. Consider as well that nearly 200,000 medical journals and articles are published annually. As a result, your diagnosis is limited to the knowledge the doctor has at the time of your appointment. Sure, the doctor can search your symptoms and consult with colleagues, but the likely outcome is being overwhelmed with data and information, some of it highly relevant and some of it not even close.

However, suppose the doctor has help in the form of a system that can think along with the doctor? And what if that system can find both timely and relevant information that leads to determining that those flu-like symptoms may be something else? In that case, you are no longer limited to that moment-in-time experience, and you truly have the benefit of big data. If such a system could work alongside the doctor, he or she could leverage every bit of data available to help provide a more accurate diagnosis and treatment plan than would be possible with just the traditional sources.

This example represents a crossroads among big data, data warehousing, BI, and cognitive computing. Data is moving from keeping us informed to really changing the world for the better, which is a very exciting place to be. The essence of building data warehouses, the capability to leverage data in making decisions, has not changed, but it does have to evolve. Through this evolution, the time has arrived to use data to drive profound change.

IBM Watson™ offers the next generation of cognitive computing that helps bring understanding to big data, and data warehousing and big data are headed in that direction. IBM believes that innovation is delivered in technology, and innovation occurs when technology changes things. A better understanding of information can drive significant change everywhere.

An advanced business unit springs forth

Recently, IBM announced the formation of a new business unit called Watson that focuses on cognitive computing. With this strategic move, the data warehousing and big data platforms are now part of IBM Watson Foundations.

Many people have asked me about the connection between these two key initiatives, and my answer is cognitive: the ability to understand the data is what’s next for the data warehouse and big data platforms. As a result, a shift is underway in which we’re moving from being suppliers of data to being enablers of understanding it when needed. IBM is investing in this space and building an ecosystem around this capability that also includes all the functions available today on the big data platform. The focus now is on increasing understanding as data grows and big data surges.

Putting a modernization strategy into action

Organizations that want to get ready for these initiatives must start with a modernization strategy for their big data and warehouse architecture. Modernization is not about ripping and replacing infrastructure. It is about investing in solutions that make sense for businesses and finding the right starting point.

Over the last few years, IBM has been writing about its point of view on warehouse architecture and its evolution in the big data era. It recently released the Zone Architecture, its logical data warehouse that breaks down the big data platform into areas or zones of capabilities. A modernization strategy should always start with a business-value proposition, and many data warehouses today suffer from a lack of agility and simplicity. Keep the following key points in mind when considering a modernization strategy:

  • Focus on speed to value. Can the architecture help answer questions the organization could not answer before, or can it answer them more quickly than previously possible? Can advanced technology such as in-memory data help speed the processing of information? In-memory data helps answer questions faster than disk-based data storage, and when the processing is accelerated, more questions can be asked and more capability can be added than ever before.
  • Focus on delivery and agility. Keeping up with business demand has been a killer for many data warehouses, because the complex structures that were built make it hard to respond quickly. The terms agility and simplicity were not used when designing data warehouses in the prior era, but they are critical today. Many businesses cannot wait, and organizations no longer live the real-time business; they live the now business. Leverage data warehouse appliances to help deliver capabilities that let you focus on the value of the business and not the integration. And look to the cloud as another way to deliver these capabilities.
  • Leverage new data types to enhance existing analytics. The Apache Hadoop framework can complement data warehouses by adding the capability to store unstructured data or become the landing zone for all data to be used downstream in the enterprise.
  • Combine technologies to gain value. The now businesses can’t wait for data to land someplace before a discovery is made. For example, a telecommunications organization needs to look for outage patterns across networks while the data is in flight, and then extract the outliers for further analysis to prevent outages later. By combining data-in-motion analysis with data-at-rest analysis, organizations can gain true operational intelligence.
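
The last point, analyzing data in flight and keeping only the outliers for later study, can be sketched in a few lines. This is a minimal illustration and not any particular product's method: it assumes readings arrive one at a time, uses a rolling z-score as a stand-in for whatever pattern detection a real streaming engine would apply, and treats a plain Python list as the hypothetical at-rest store. The function name, window size, threshold, and sample values are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def find_outliers(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Each reading is examined as it arrives; only a small window of
    recent values is held in memory.
    """
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield index, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)


# Hypothetical network latency readings in milliseconds.
stream = [20, 21, 19, 20, 22, 21, 95, 20, 19, 21]

# Outliers extracted in flight are kept for deeper at-rest analysis.
at_rest_store = list(find_outliers(stream))
print(at_rest_store)
```

The design choice worth noting is that the bulk of the stream is never stored; only the extracted outliers land in the store, which is where the slower, deeper analysis would take place.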

The world of data warehousing continues to evolve. Beginning a modernization strategy with incremental investments for iterative delivery of new value and capability is important. Forget architecting for real-time business; the now business is here, and operational intelligence is critical.

The next step is to leverage cognitive computing to accelerate understanding and provide relevance to the business touch points in a timely fashion. I look forward to the day when we just ask data warehouses a question—that capability will truly be no SQL. Please share your thoughts or questions in the comments.
