Part I: Key Considerations for the Analytic Enterprise

CTO, Digital Media and General Business, Netezza, an IBM Company

Today’s organizations understand the value of analytics. Recent studies, such as the joint IBM/MIT Sloan Management Review study, The New Intelligent Enterprise, have shown that strong positive correlations exist between market leadership and an enterprise’s analytics IQ.

Indeed, it’s fair to say that much of the attention around big data comes from companies with a vested interest in “upping” their analytics IQ to differentiate themselves in the marketplace. So much so that many organizations now believe that gleaning deep analytical insights from data is a survival imperative in a global economy.

The New Era of Analytics

But what happens when an increasing number of organizations embrace analytics? Do they lose their competitive edge because they are all competing with increased insight? For example, over a decade ago, the Oakland Athletics embraced analytics and data-driven approaches to player valuation, using them to gain an advantage over other teams in Major League Baseball (MLB). Indeed, “The A’s” were leaders among their peers, demonstrating exceptional payroll-to-performance and net-revenue-per-seat indicators. They set themselves apart from their industry peers using analytics. In fact, their story is so compelling and generated so much interest that it was told in the best-selling book and hit movie Moneyball, starring Brad Pitt.

Eventually, other teams caught up. Today, there isn’t a single team in MLB that doesn’t use sophisticated analytics. Widespread adoption of analytics within MLB has neutralized any advantage that the Oakland A’s once had in this domain.

This post is Part I of a four-part series on IBM’s big data platform. In this part, I describe the key analytical capabilities that an enterprise needs to put in place to stay competitive in today’s big data world. In subsequent parts, we will look at how IBM translates these considerations into its big data platform components and technology solution stack. The series is an abridged excerpt from our new book, Harness the Power of Big Data, which is available as a free e-book download.

It’s fair to say that we’re now entering an era in which analytics is considered a “table stakes” capability for most organizations, and the differentiation lies in how far they push the envelope in their use of analytics. Let’s look at a specific example of how an organization is taking the use of analytics to a whole new level.

Using analytics to measure customer support effectiveness is a common practice among many customer-centric organizations. It enables them to monitor customer satisfaction, drive retention, and manage the cost of support. The traditional approach involves analyzing all of the data associated with a support incident, such as call duration and speed of resolution, and then identifying opportunities for improvement. It also involves conducting surveys and collecting satisfaction metrics. One such metric, the net promoter score, can be a very effective tool in gauging a customer’s perception of the company or product based on their interactions with support personnel.
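
The net promoter score mentioned above has a simple, widely used definition: survey respondents rate, on a 0–10 scale, how likely they are to recommend the company; 9–10 are promoters, 0–6 are detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal sketch (not tied to any particular vendor’s tooling):

```python
def net_promoter_score(ratings):
    """NPS from 0-10 'likelihood to recommend' survey ratings.

    Respondents scoring 9-10 are promoters, 0-6 are detractors;
    NPS is the percentage of promoters minus the percentage of
    detractors, ranging from -100 to +100.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

# 3 promoters, 2 detractors out of 7 responses -> NPS of about 14.3
print(round(net_promoter_score([10, 9, 10, 8, 7, 6, 3]), 1))
```

Note that 7s and 8s count toward the denominator but toward neither group, which is why the score is so sensitive to moving customers out of the detractor band.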

Although this traditional approach can be effective in improving customer satisfaction and reducing churn, the analytical cycle time (the time interval between a support call and the actual process improvements that get pushed to the front line) can be quite long. During that time, other customers might have similarly poor support experiences that could cause them to churn. The opportunity for organizations to differentiate and compete revolves around not only the use of deep analytics at the core of their business, but also the analytical cycle time. Like a giant snowball rolling down a hill, the impact of analytics on your business is slow at first, but with every rotation, the potential impact becomes greater and greater.

With this in mind, the question really becomes, “Is it possible to take the analytical models and processes that have been built on historical data sets and apply them in real time to streaming data?”

One of our clients is currently in the process of doing exactly that. They have an intelligent intercept agent that monitors all telephone conversations between customers and customer support representatives (CSRs). This agent monitors the conversation, applies sentiment analysis to that conversation, and provides recommendations to the CSR in real time. For example, if a customer uses tonal inflection to ask a question, or uses sarcasm to express displeasure, the automated agent is able to detect that immediately and provide specific guidance to the CSR. The advice could be to answer the question differently, escalate the call to the next level, provide specific incentives to the customer, or simply to be more polite.
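
The client’s actual system is proprietary, but the core pattern can be sketched with a toy lexicon-based sentiment scorer that accumulates a score across utterances and maps it to CSR guidance. Everything here (word lists, thresholds, advice strings) is invented for illustration; a production system would use trained speech-to-text and sentiment models:

```python
# Toy sketch of a real-time call-monitoring agent. Lexicons and
# thresholds are invented for illustration only.
NEGATIVE = {"waiting", "ridiculous", "cancel", "unacceptable", "frustrated"}
POSITIVE = {"thanks", "great", "helpful", "resolved"}

def score_utterance(text):
    """Net sentiment of one utterance: +1 per positive word, -1 per negative."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def advise(running_score):
    """Map the cumulative sentiment of the call so far to CSR guidance."""
    if running_score <= -3:
        return "escalate to next support level"
    if running_score <= -1:
        return "offer an incentive and soften tone"
    return "no action needed"

running = 0
for utterance in ["I have been waiting for an hour",
                  "this is ridiculous, I want to cancel"]:
    running += score_utterance(utterance)
    print(advise(running))
```

The essential point is the accumulation step: guidance reflects the whole conversation to date, not just the last sentence, so the recommendation can escalate as a call deteriorates.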

By intercepting, monitoring, and analyzing such calls in real time, this client is able to vastly improve support effectiveness by taking immediate remedial action to improve customer satisfaction. These capabilities don’t replace traditional offline analytics; rather, they augment them by incorporating new varieties of data (voice in this case) and performing analytics in real time.

Key Considerations for the Analytic Enterprise

Differentiating on analytics means using data-driven insights to enhance organizational strategies and applying analytics in ways that were previously impossible. Limitations that used to restrict where and how organizations could run analytics are being eliminated. Moreover, some of today’s most analytics-driven enterprises are changing their analytic deployment models to gain competitive advantage. The following sections describe some of these new deployment models.

1. Run Analytics Against Larger Data Sets

Historically, performing analytics on large data sets was a very cumbersome process, so organizations resorted to running their analytics on a sampled subset of the available data. Although the models they built and the predictions they generated were often good enough, they recognized that sampling could introduce errors or bias into their conclusions, and that using more data would improve their results.

Organizations that can run analytics, at scale, against their entire data sets have a definite advantage over those that cannot. Pacific Northwest National Laboratory’s Smart Grid Demonstration project is a great example. The project aims to spur a vibrant new smart grid industry and a more cost-effective, reliable electricity supply, both of which are drivers of US economic growth and international competitiveness. The team plans to collect large amounts of data, specifically event data from 60,000 metered customers across five states, and run complex analytical models on it to predict and prevent power outages.

2. Run Analytics Blazingly Fast

Running analytics is a multistep process. It involves data exploration, data cleansing and transformation, creating analytical models, deploying and scoring those models, publishing outcomes, and then refining the models. It’s also an iterative process. If the underlying analytical system performs poorly while running analytical queries, this adds latency to the overall process. Hence, having the ability to run complex analytics on large data sets extremely quickly has distinct advantages.
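
To see why query speed matters so much in an iterative process, consider a back-of-the-envelope calculation (the numbers below are made up purely for illustration): if each refinement iteration involves one long analytical query plus some fixed amount of analyst work, total cycle time scales with the number of iterations, and the query latency compounds:

```python
def cycle_time_hours(iterations, query_hours, analyst_hours_per_iteration):
    """Wall-clock time for an iterative analytics workflow in which each
    iteration runs one analytical query and then refines the model."""
    return iterations * (query_hours + analyst_hours_per_iteration)

# Made-up numbers: 10 refinement iterations, 1 hour of analyst work each.
slow = cycle_time_hours(10, 26, 1)      # a 26-hour query per iteration
fast = cycle_time_hours(10, 2 / 60, 1)  # a 2-minute query per iteration
print(slow)             # 270 hours
print(round(fast, 1))   # 10.3 hours
```

With slow queries, the machine time dominates and analysts wait; with fast queries, the human work becomes the bottleneck, and the freed-up time can go into more iterations and richer models.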

First, it enables greater business agility and dramatically cuts down on overall decision-making time. A stock exchange client of ours, NYSE Euronext, cut the time to run deep trading analytics on 2 PB of data from 26 hours to 2 minutes! They were running deep analytic queries that required significant data access and computation. This boost in performance not only helped them react to market changes faster, but also enabled them to increase the complexity of their analytical models.

Second, it improves analyst productivity. Another client, Catalina Marketing—a large retail marketing services provider—was able to increase their analysts’ productivity by a factor of six. They did this by reducing their average analytical model scoring time from 4.5 hours to 60 seconds. As a result, they were able to run more analytical models, using the same staff, and deliver deeper, transformative business insights.

3. Run Analytics in Real Time

Having the ability to run analytics in real time, as events occur, has a profound impact on how an organization reacts to change. For example, the University of Ontario Institute of Technology applies analytics to data in motion, analyzing streamed data from various monitors and vital-sign indicators with the goal of predicting the onset of potentially life-threatening conditions up to 24 hours earlier, which can make a huge difference in patient outcomes. A telecommunications client of ours analyzes streaming network traffic data in real time to detect bottlenecks and apply preventive maintenance to their network. By moving analytics “closer to the action,” organizations create tremendous opportunities for differentiation.
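
The telecommunications client’s system is far more sophisticated, but the basic pattern of streaming analytics, evaluating each event as it arrives against a sliding window rather than querying data at rest, can be sketched as follows (the window size, threshold, and sample values are invented):

```python
from collections import deque

# Toy streaming detector: flag a bottleneck when the moving average of
# link utilization over the last 5 samples exceeds 90%.
WINDOW, THRESHOLD = 5, 0.9

def detect_bottlenecks(samples):
    window = deque(maxlen=WINDOW)   # bounded window: old samples fall off
    alerts = []
    for i, utilization in enumerate(samples):
        window.append(utilization)  # process each event on arrival
        if len(window) == WINDOW and sum(window) / WINDOW > THRESHOLD:
            alerts.append(i)        # a real system would trigger maintenance here
    return alerts

print(detect_bottlenecks([0.5, 0.8, 0.95, 0.97, 0.99, 0.98, 0.6]))  # [5]
```

The key difference from batch analytics is that the decision is made inside the event loop, at the moment the sample arrives, rather than hours later in a scheduled report.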

4. Run Analytics on a Broader Variety of Data

Earlier in this post, we described a client who dramatically improved customer support effectiveness by analyzing voice data from support conversations in real time. The ability to incorporate newer varieties of data, such as voice, text, video, and other unstructured data types, along with structured relational sources, opens up possibilities for improving efficiencies and differentiating competitively. One of our retail clients is now correlating social media data with the point-of-sale data in their data warehouse. Before launching a new brand, they know what type of buzz it is generating, and they use that information to forecast product sales by geography and to ensure that merchandise is stocked to that level. This involves running deep analytic queries against inventory levels and computationally heavy forecasting models.
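
The retailer’s actual models are proprietary; a minimal sketch of the underlying idea, scaling a baseline regional forecast by relative pre-launch social media buzz, might look like this (all region names and numbers are invented for illustration):

```python
# Invented illustrative data: baseline unit forecasts per region and
# pre-launch brand mentions harvested from social media.
baseline_units = {"northeast": 10_000, "southwest": 8_000}
brand_mentions = {"northeast": 4_500, "southwest": 1_500}

average_mentions = sum(brand_mentions.values()) / len(brand_mentions)

def buzz_adjusted_forecast(region):
    # Regions with above-average buzz get proportionally larger stock targets.
    buzz_index = brand_mentions[region] / average_mentions
    return round(baseline_units[region] * buzz_index)

for region in baseline_units:
    print(region, buzz_adjusted_forecast(region))
```

A real deployment would join the social feed to point-of-sale history in the warehouse and fit a proper forecasting model; the sketch only shows the correlating step that turns unstructured buzz into a structured stocking signal.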

The role of analytics in an organization is changing and the considerations for deploying analytics are evolving. In Part II of this series, we will discuss the big data platform capabilities needed to support these changes.