Banking on Hadoop

Global Banking Industry Marketing, Big Data, IBM

Why bankers should care about Hadoop

If you’ve been at even the periphery of banking technology in the last few years, you’ve heard about Hadoop. But if you work on the business side of banking, you may think, “Sure, I’ve heard of Hadoop, but why should I care? Technology is IT’s job. It’s the end result that I’m looking for.” That thinking has some validity: the driver of a car isn’t required to know the inner workings of the engine, as long as it gets him where he wants to go. However, it can be handy to understand the capabilities of the car. If you have just five gallons of fuel left in the tank and 200 miles to the next station, you may prefer the capabilities of a small hybrid over a muscle car, assuming a long walk is not in order. In the same way, it can be beneficial for bankers on the business side to be aware of the capabilities certain technologies can provide.

Let’s focus on the new business capabilities that have been enabled by Hadoop. But don’t fret—we won’t go beyond a quick description, and will leave the inner workings of Hadoop to our capable technologists.  

 Apache Hadoop is:

  • An open source software project that enables the processing of large data sets across clusters of commodity servers
  • Designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance
  • Very scalable and, of great importance to this discussion, very “flexible”
  • Enhanced with the ability to absorb any type of data, from many different sources. This data can then be joined and aggregated in many ways, enabling deeper analyses than any one system can provide.
  • Unlike current relational database systems in that it can handle all types of data, and is especially known for its ability to handle unstructured data, like emails, web pages, audio or free-form text.
  • The cornerstone technology that is driving today’s big data revolution
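For readers curious about what “processing across clusters” looks like in practice, here is a minimal Python sketch of the map-and-reduce pattern that Hadoop’s classic MapReduce engine is built around. It is not Hadoop code: the customer names, products and partitions are made up for illustration, and the three lists merely stand in for slices of data held on separate servers.

```python
from collections import Counter
from functools import reduce

# Hypothetical (customer, product) records, split into partitions
# as if each slice were stored on a different server.
partitions = [
    [("alice", "card"), ("bob", "mortgage"), ("alice", "card")],
    [("carol", "card"), ("bob", "card")],
    [("alice", "mortgage"), ("carol", "card")],
]


def map_partition(records):
    """Map step: each server counts product usage in its own slice."""
    return Counter(product for _customer, product in records)


def merge_counts(left, right):
    """Reduce step: combine the partial counts from two servers."""
    return left + right


# On a real cluster the map step would run in parallel, one task per slice.
partial_counts = [map_partition(p) for p in partitions]
total_counts = reduce(merge_counts, partial_counts, Counter())

print(dict(total_counts))
```

Because each slice is counted independently, adding more data simply means adding more slices and more servers, which is one way to picture the scalability described above.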

Let’s look at two of the many new capabilities in which Hadoop plays a key role.

Leverage more data for deeper analysis

Hadoop plays a big part in the capability to analyze more data than ever before to derive deeper business insight. In the past, when banks wanted to generate insight to support their decision-making process (such as determining which customers would accept an offer for a new product), they would collect a statistically representative sample of information, analyze that subset to model the profile of an ideal purchaser, and then send an offer to those who matched the model customer. Then along came Hadoop. Now banks have the technology and capability to analyze ALL customer information from ALL sources, which can yield much more robust insight that supports more accurate predictive models.

For example, a card unit of a large bank wanted to improve their offer acceptance rates for balance transfer campaigns. With their previous approach of analyzing a subset of customer data, they were getting unsatisfactory response rates on transfer offers. With the capabilities supplied by Hadoop and other big data technologies, they are now able to analyze their entire customer data set. They now can generate much more relevant and accurate insights upon which to base their credit card offers, resulting in significantly improved offer acceptance rates.

Analyze data not often leveraged

For many years, banks have limited their analysis of data in both type and source. If we stay with the analysis of customer data as in the above example, banks have typically relied on historical transaction data, demographic data and customer profile data for insight. As we indicated earlier, Hadoop can assist in the analysis of many different types of data from many sources. It is now possible, from a speed and cost perspective, to analyze other key types of data from a variety of sources. There is a wealth of insight that can be derived from additional customer data not often analyzed, and much of it is in unstructured or semi-structured form. Here are several often untapped sources of great customer insight:

  • Attitudinal Data: Opinions and preferences stored in a variety of systems 
  • Behavioral Data: Payment history or channel usage
  • Interaction Data: Email and chat transcripts, web clickstreams or contact center notes

For example, a European bank wanted to improve their cross-sell success as well as increase customer retention. The bank believed they had a wealth of customer interaction data that they were not currently leveraging, and felt they could increase cross-sell success by improving customer insight through analysis of this unused data. They employed IBM BigInsights, IBM’s distribution of Hadoop, to ingest unstructured and semi-structured customer interaction data from the contact center in the form of CSR notes, branch advisor notes and emails the bank received from customers. By feeding this additional insight from interactions into their predictive models, the bank was able to improve the targeting and timing of cross-sell offers, which resulted in an increase in wallet share and a significant reduction in customer attrition.
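To make the idea of turning interaction text into model inputs more concrete, here is a small Python sketch. It is an illustration only, not the bank’s method and not BigInsights code: the customer IDs, notes, signal names and keyword lists are all hypothetical, and simple keyword matching stands in for real text analytics.

```python
# Hypothetical structured customer records, keyed by customer ID.
customers = {
    "C001": {"segment": "retail", "products": 2},
    "C002": {"segment": "retail", "products": 1},
}

# Hypothetical unstructured interaction text: (customer ID, source, text).
interactions = [
    ("C001", "csr_note", "Customer asked about mortgage rates, planning to buy a house."),
    ("C002", "email", "I am unhappy with the fees and may close my account."),
    ("C001", "branch_note", "Follow-up on mortgage pre-approval."),
]

# Illustrative keyword lists (an assumption for this sketch).
SIGNALS = {
    "mortgage_interest": ("mortgage", "house"),
    "attrition_risk": ("close my account", "unhappy"),
}


def extract_signals(text):
    """Return the names of all signals whose keywords appear in the text."""
    lowered = text.lower()
    return {
        name
        for name, keywords in SIGNALS.items()
        if any(keyword in lowered for keyword in keywords)
    }


def enrich(customers, interactions):
    """Join signals found in interaction text onto each customer profile."""
    enriched = {cid: {**profile, "signals": set()} for cid, profile in customers.items()}
    for cid, _source, text in interactions:
        if cid in enriched:
            enriched[cid]["signals"] |= extract_signals(text)
    return enriched


profiles = enrich(customers, interactions)
for cid, profile in sorted(profiles.items()):
    print(cid, profile["segment"], sorted(profile["signals"]))
```

The enriched profiles could then feed a predictive model alongside the usual transaction and demographic fields, for instance to time a mortgage offer or flag a retention call.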

Bankers should brainstorm how these and other new Hadoop and big data capabilities could add transformational value in their key business use cases through improved insight. They should then join with their IT counterparts to select one or two use cases where the application of Hadoop technology will show the fastest, most impactful benefit. Learn more and be part of the Hadoop and big data conversation here on the IBM Big Data & Analytics Hub. If you are attending Hadoop Summit 2014, stop by the IBM booth and listen in on the IBM speaking sessions.