Analytics and the cloud: A perfect match

Executive Information Architect, IBM

This is the first in a series of blogs that looks at what is driving analytics onto the cloud, which challenges will need to be overcome over the next 5 years, and how they will be tackled.

Data everywhere

We are fast approaching a point where much of the planet (including ourselves) will be instrumented, our thoughts and experiences will be captured through social media apps (text, audio, video), and most documents created will be stored digitally somewhere. Anything that can be digitized will be. So how will we start to unravel all this data and make it work for our own good?

Analytics on everything

One area we will see growing is the use of all the data we now have at our disposal about how a customer interacts with an organization and with other groups (government, social, corporate and even personal devices). This will require new analytics to link all this data, using forms of data store that differ from the traditional RDBMS (Relational Database Management System) engine, such as graph databases (one family of the databases known as NoSQL), and new forms of analytics (through sophisticated modelling) to identify signals in that data. There are many examples, ranging from the drive in governments to use such techniques to determine risk or fraud in tax returns through to equipping people with sensors in order to analyse, predict and suggest better ways to manage an individual’s health. Many of these forms of analytics will be deployed in cloud-based solutions; I’ll explain why later in this blog.
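To make the idea of linking data through a graph a little more concrete, here is a minimal sketch in plain Python. It is not how any particular graph database works; the record names, phone numbers and bank references are all invented for illustration, and a real solution would use a proper graph engine rather than an in-memory dictionary. It simply shows how records that share an attribute (a phone number, a bank account) can be connected and then walked to find everything linked to a starting record, which is the kind of signal a fraud or risk model might look for.

```python
from collections import defaultdict

# Hypothetical records: (record_id, attribute mentioned in that record).
# Every identifier and value below is invented for illustration only.
RECORDS = [
    ("tax_return_17", "phone:555-0101"),
    ("tax_return_42", "phone:555-0101"),
    ("tax_return_42", "bank:GB00-1234"),
    ("tax_return_88", "bank:GB00-1234"),
    ("tax_return_93", "phone:555-0199"),
]


def build_graph(records):
    """Build an undirected graph joining each record to its attributes."""
    graph = defaultdict(set)
    for record_id, attribute in records:
        graph[record_id].add(attribute)
        graph[attribute].add(record_id)
    return graph


def linked_records(graph, start):
    """Return every record reachable from `start` through shared attributes."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    # In this toy data, attribute nodes contain ':' and record nodes do not.
    return sorted(node for node in seen if ":" not in node)


if __name__ == "__main__":
    graph = build_graph(RECORDS)
    print(linked_records(graph, "tax_return_17"))
```

Starting from one tax return, the walk picks up a second return that shares its phone number and a third that shares a bank account with the second, even though the first and third have nothing directly in common. That indirect link is hard to spot with row-by-row queries and is where graph-style stores tend to help.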

Analytics is becoming more and more pervasive. Google now offers Google Analytics Premium for its sellers. This provides many forms of analytics and features that allow analysis of visitors to a seller’s site (whether buying or simply browsing) regardless of device. It doesn’t matter whether an individual accesses the services from a PC, phone or tablet: the analytics can look across all platforms and understand more intelligently how the customer interacts with the seller’s services. Plus, this is all delivered on the cloud. These cloud-based analytics work closely with a customer-centric approach to ensure that businesses understand their markets. This goes well beyond Google and online businesses to traditional bricks and mortar retailers, banks, insurers and so on. The behaviors of empowered customers are what drive this shift.

Other companies, such as Netflix (which analyzes customers’ viewing habits and can control the granularity of analytics down to the sections of movies that customers fast forward, rewind or pause on) and Amazon, are also building detailed analytics of customers’ purchases, habits and preferences. Note that Netflix uses Amazon Web Services (AWS) to store much of its data.

All these advances also bring risks. The number of unsatisfied customers using these services has begun to rise as ubiquitous technology has multiplied the ways of reaching a customer. The mobile phone has been a major player in this ‘always on’ society. If data across multiple channels and supply chains isn’t joined up sensibly, then ill-informed content sent when contact is made with a customer significantly raises the potential for damage to reputation and brand. This can be severely compounded when users take to social media to voice their concerns loudly when such events take place! Analytics helps here: to test new launches with specific groups of users, to simulate their responses, or simply to identify very quickly (in real time) if something is amiss. For example, online gaming companies make extensive use of analytics to simulate and model how a new game will perform, both to ensure customer satisfaction and, where any form of gambling takes place on the platform, to manage risk, i.e. whether the game will make the expected returns.

The analytics we deploy must work across all channels to help ensure the customer view is seamless regardless of the way customers elect to interact with the business. The continual rise of customer-targeted campaigning will probably lead to people finally recognizing the value of their own data and beginning to question how it is being used. Data is, to all intents and purposes, increasingly becoming a digital footprint of our lives, and as such it is a very valuable commodity to ourselves and to businesses and governments. This may lead to a whole new set of regulations and privacy rulings coming into force, under which people can more easily opt out of supplying their data and only allow access to it where they perceive a benefit to themselves in doing so (maybe a micro-payment for using the data, or some additional services that wouldn’t be available otherwise). GDPR (a regulation adopted by the European Parliament and the Council of the European Union on 27 April 2016) is one indication of where things could be heading.

So why has the cloud become critical in the development of analytical solutions? One obvious point is that data volumes just keep getting bigger. The ever increasing need to store data will see rapid cloud data centre growth. Security, encryption, audit, archive and recovery will become even more important and complex, and regulatory control in some industries will enforce these requirements. Analytics to identify what to archive will be key; there is simply too much data to do this manually. Information Lifecycle Management could be automated by weighing how often data is accessed by the business against the need to retain it. Storage will be moved to the cloud wherever possible (renting rather than buying resources). In tandem with storage needs, server workloads and network bandwidth continue to grow (the internet has an unquenchable thirst), so there is a pressing need to further optimize the entire infrastructure. This thirst for data is simply becoming too great for a single body to maintain on premises, using dedicated hardware and software stacks. The use of multi-tenant or dedicated cloud-based solutions allows resources to be purchased ‘on demand’, driven by resource consumption rather than by capital expenditure that has to be amortized over several years.
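As a rough illustration of what automating Information Lifecycle Management might look like, the sketch below weighs how recently and how often a dataset is accessed against its retention deadline. Everything here is an assumption made for the example: the Dataset fields, the lifecycle_action function and the thresholds (180 idle days, 10 reads in 90 days) are hypothetical, not taken from any IBM product or recommended policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dataset:
    """Hypothetical catalogue entry for one stored dataset."""
    name: str
    last_accessed: date
    reads_last_90_days: int
    retain_until: date  # regulatory or business retention deadline


def lifecycle_action(ds, today, cold_after_days=180, hot_reads=10):
    """Suggest 'keep', 'archive' or 'delete' for a single dataset.

    The thresholds are illustrative assumptions, not recommended values.
    """
    idle_days = (today - ds.last_accessed).days
    is_cold = idle_days > cold_after_days and ds.reads_last_90_days < hot_reads
    if not is_cold:
        return "keep"      # still in active use by the business
    if today <= ds.retain_until:
        return "archive"   # unused, but must still be retained
    return "delete"        # unused and past its retention deadline


if __name__ == "__main__":
    today = date(2017, 1, 1)
    catalogue = [
        Dataset("web_clicks", date(2016, 12, 20), 50, date(2020, 1, 1)),
        Dataset("old_invoices", date(2016, 1, 1), 0, date(2020, 1, 1)),
        Dataset("expired_logs", date(2015, 1, 1), 0, date(2016, 6, 1)),
    ]
    for ds in catalogue:
        print(ds.name, lifecycle_action(ds, today))
```

In practice the access counts would come from storage or database audit logs and the retention dates from regulatory rules, but the decision itself can stay this simple: data in active use stays where it is, unused data that must be kept moves to cheaper archive storage, and unused data past its retention date becomes a candidate for deletion.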

So the future is "on the cloud," but what are the trends that analytical solutions will need to be capable of dealing with over the next 5 years and beyond? This blog series takes a look at a number of themes:

  • IoT: The rise in sensors attached to just about anything will bring new challenges for analytics in making use of this data. Alongside systems of record and systems of engagement, we now see systems of automation extending what is needed to create systems of insight. I’ll look at just what can be done today with things like Watson IoT.
  • NoSQL: These databases have been around for some time. What are they, where do they fit alongside traditional structured (SQL) engines, and how do we extract analytics from schemaless engines?
  • Open Source: How does open source impact the IT landscape? It already has! Where will it go next, and what are the issues behind using open source? What analytics are available as open source and where are they heading, especially in the realms of statistical modelling and machine learning?
  • Hybrid Clouds: Public, private, on- or off-premises: how do we navigate to an optimum solution for a business, and where do analytics fit?
  • Cognitive: It’s everywhere we look these days, but what’s behind it? We take a peek under the covers and see how analytics has helped it to evolve.
  • Bluemix: IBM’s Platform as a Service (PaaS). It’s critical for the future, so what can we do with it now?
  • Longer term: Hardware will once again come to the fore and revolutionize how we manage analytics. Synaptic quantum processors will dramatically change how chips work and the speed they work at, opening the way for analytics to be embedded almost anywhere. I’ll take a look at 5-10 initiatives that will transform analytics and our lives over the next 5-10 years.

Learn more about IBM Analytics technology