Cloud Analytics: Derive Insight Without On-Premises Infrastructure
Cloud computing accelerates and broadens the delivery of big data analytics services
Individually, cloud computing and big data analytics have been acknowledged as disruptive technologies. Combined as cloud-based analytics, these technologies have the potential not only to accelerate and widen their adoption but ultimately to transform organizations, entire industries, and even society.
Cloud analytics covers a wide range of offerings, including most notably what is referred to as analytics as a service. Cloud analytics also comprises the use of cloud services for an individual component such as storage or visualization, a comprehensive cloud-based analytics platform, and an analytics ecosystem that monetizes data for an industry. Examples extend from a simple Twitter data-feed service to an entire analytics ecosystem such as GE Predictivity or IBM Watson™ software.
Cloud analytics provides many benefits, including convenience, ease of implementation, and little or no up-front spending on infrastructure. However, line-of-business users and providers currently face many challenges, including security, data gravity, and integration.
Teaming up technology
Data quantities are exploding beyond wild imagination, growing at a 40 percent compound annual growth rate, and they are projected to reach nearly 45 billion TB by 2020.1 Organizations that can make effective use of this growing volume of data can gain significant advantages. They are realizing that big data analytics can help them greatly enhance customer service, improve operational efficiency, reduce cost, and, most importantly, innovate.2 Cloud computing offers a relatively easy path to provisioning services, while analytics solutions are challenging to implement, which makes the two technologies a fitting match.
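To get a feel for what a 40 percent compound annual growth rate implies, the following is a minimal sketch of the standard compound-growth formula. The 45 billion TB target and the 40 percent rate come from the text; the seven-year look-back window and the function names are assumptions chosen only for illustration.

```python
import math

def project_volume(base_volume, annual_rate, years):
    """Compound growth: base_volume * (1 + annual_rate) ** years."""
    return base_volume * (1.0 + annual_rate) ** years

def implied_baseline(target_volume, annual_rate, years):
    """Volume that must exist `years` earlier to reach target_volume."""
    return target_volume / (1.0 + annual_rate) ** years

RATE = 0.40              # 40 percent compound annual growth (from the text)
TARGET_2020_TB = 45e9    # nearly 45 billion TB by 2020 (from the text)
YEARS_BACK = 7           # assumption: look back seven years from 2020

baseline_tb = implied_baseline(TARGET_2020_TB, RATE, YEARS_BACK)
doubling_years = math.log(2) / math.log(1.0 + RATE)

print(f"Implied volume {YEARS_BACK} years earlier: {baseline_tb / 1e9:.2f} billion TB")
print(f"Doubling time at {RATE:.0%} growth: {doubling_years:.2f} years")
```

At this rate, the total volume doubles roughly every two years, which is why capacity planned for today's data can be outgrown quickly.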
The notion of outsourcing the analysis of data has been around for a long time. In 1890, Herman Hollerith won a contract from the US Census Office to process census data using his newly invented tabulating machines.3 In a more recent example, the Australian Bureau of Statistics (ABS) in 2006 enabled the online dissemination of statistics based on a desktop self-service model.4 Today, this model is extended by President Barack Obama’s Open Government initiative and by many other governments around the world. Look for an upcoming IBM Data magazine article that discusses how providing target data for analytics is one pattern of cloud analytics, albeit a simple one.
Provisioning cloud-based analytics
Cloud analytics is the use of cloud computing to provision one or all elements of an end-to-end analytics solution.5 These elements may include data sources, data models, processing applications, computing power, analytics models, and storage.6 Under this definition, most big data analytics solutions qualify as cloud analytics because the majority of big data is generated or stored on cloud-based platforms. Examples include data generated in social networks and by mobile devices, cars, and radio frequency identification (RFID)–enabled devices.
There are many vendors that provide a wide range of cloud analytics offerings such as business intelligence (BI) as a service, social media analytics, and many more. These vendors include emerging niche solution providers such as Tableau and Experian and large end-to-end solutions providers such as IBM and Teradata. Some of these offerings are not new; they are repackaged, existing products made consumable as a service from the cloud. Although analytics software functionality available through cloud deployment models currently represents a small portion of the total analytics market, it is growing much faster than on-premises software.7
Benefits and drivers
Cloud analytics can offer enhanced agility and scalability along with reduced total cost of ownership (TCO). In essence, it democratizes analytics by making advanced capabilities cost-effective for small organizations that cannot afford the infrastructure required for an on-premises implementation. The benefits of cloud analytics include cost-effective rapid deliverability, agility and elasticity, and accessibility and collaboration.
A significant benefit of cloud analytics is its potential economic advantage. Some cloud analytics offerings claim to reduce cost by as much as a factor of 10 compared with implementing equivalent capability in an on-premises infrastructure.8 Moreover, the growing adoption of cloud services helps keep costs down. In one well-publicized cloud analytics case study, The New York Times in 2007 used 100 Amazon EC2 instances to process 4 TB of raw images into 11 million finished PDF files in 24 hours.9 Compare this result with the cost and effort of doing the same task using on-premises infrastructure, and the high value of cloud analytics is quite clear.
By helping eliminate typically long procurement, installation, and deployment periods, cloud analytics solutions can be implemented in a fraction of the time it takes to implement on-premises solutions. Another key advantage of cloud analytics is the capability to leverage its elastic computing or burst resources, especially when algorithms demand aggressive assignment of storage or computing resources. This elasticity, which is the source of cloud computing's agility, facilitates the rapid scaling of solutions from a pilot project to a full production deployment.
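The arithmetic behind The New York Times example can be sketched as follows. The instance count, duration, data volume, and file count come from the text; the hourly rate is a hypothetical placeholder, not the price actually paid, and the single-machine comparison assumes perfectly linear scaling.

```python
INSTANCES = 100        # Amazon EC2 instances (from the text)
HOURS = 24             # wall-clock duration of the job (from the text)
PDF_FILES = 11_000_000 # finished PDF files (from the text)
RAW_TB = 4             # raw image data in TB (from the text)

# Total compute consumed, measured in instance-hours.
instance_hours = INSTANCES * HOURS

# Average throughput of one instance.
pdfs_per_instance_hour = PDF_FILES / instance_hours

# Time one equivalent machine would need, assuming linear scaling.
single_machine_days = instance_hours / 24

def job_cost(total_instance_hours, hourly_rate_usd):
    """Pay-per-use cost: instance-hours multiplied by the hourly rate."""
    return total_instance_hours * hourly_rate_usd

HYPOTHETICAL_RATE_USD = 0.10  # assumption for illustration only

print(f"Instance-hours consumed: {instance_hours}")
print(f"PDFs per instance-hour: {pdfs_per_instance_hour:.0f}")
print(f"Days on a single equivalent machine: {single_machine_days:.0f}")
print(f"Cost at the hypothetical rate: ${job_cost(instance_hours, HYPOTHETICAL_RATE_USD):.2f}")
```

The point of the sketch is the shape of the trade-off: the same 2,400 instance-hours cost the same whether they run on one machine for 100 days or on 100 machines for one day, so elastic capacity buys speed without a matching hardware purchase.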
Information on cloud platforms is readily accessible using a browser or a mobile device, and it is therefore sharable with anyone who has connectivity. Collaborating with business partners, customers, and even colleagues within the same organization can be highly efficient when using cloud-based technology.
Problems and challenges
Notwithstanding the benefits of cloud analytics, line-of-business users and providers face many hurdles when implementing it. These challenges include integration, security, compliance, and data gravity.
Integration can be either a benefit or a challenge in cloud analytics, depending on the location of the data. Integrating data already on the cloud, such as data originating from social networks or software-as-a-service (SaaS) solutions, is generally easier than it is in on-premises implementations. However, integrating cloud analytics with enterprise data can be challenging. In general, cloud-based solutions handle distributed and disparate data, and therefore they tend to have more elaborate integration requirements.10
Security is by far the most frequently cited concern related to the adoption of cloud solutions in general, and cloud analytics in particular. This concern is not surprising because many analytics solutions need to access sensitive customer, product, or employee data that has to be protected from unauthorized access. Many organizations are understandably quite reluctant to place this data in the cloud or provide access to it across the enterprise firewall.
Another challenge that relates to the location of data and security is compliance with regulations concerning specific sensitive data that must reside within certain jurisdictions. This obstacle is forcing many cloud providers to build local points of delivery.
In his December 2010 blog post, “Data Gravity – in the Clouds,”11 Dave McCrory introduced the term data gravity to describe the increasing likelihood that additional services and applications will be attracted to data as it accumulates, or develops mass (see figure). This possibility arises because of the inevitable delay—latency—in accessing remote data, which motivates the placement of processing and, by extension, smaller data sets where the largest data set resides. The result is constant and potentially very costly data transfers.
Data gravity resulting from the tendency of large data sets to grow by attracting smaller data
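The pull that a large data set exerts can be illustrated with a rough estimate of transfer time. This is a minimal sketch, not a model from McCrory's post: the data set names, sizes, and the 1 Gbps link are hypothetical, and the calculation assumes an ideal link with no protocol overhead.

```python
def transfer_hours(size_tb, bandwidth_gbps):
    """Hours to move size_tb terabytes over an ideal bandwidth_gbps link."""
    bits = size_tb * 1e12 * 8                 # decimal terabytes to bits
    seconds = bits / (bandwidth_gbps * 1e9)   # gigabits per second to bits per second
    return seconds / 3600

def best_processing_site(datasets_tb):
    """Pick the site that minimizes the data moved to co-locate everything.

    Returns (site, tb_moved). Moving every other data set to the site that
    already holds the largest one leaves the most data in place.
    """
    site = max(datasets_tb, key=datasets_tb.get)
    tb_moved = sum(datasets_tb.values()) - datasets_tb[site]
    return site, tb_moved

# Hypothetical data sets and sizes in TB, for illustration only.
datasets = {"cloud_warehouse": 500, "on_prem_crm": 20, "saas_logs": 5}
LINK_GBPS = 1  # hypothetical link speed

site, tb_moved = best_processing_site(datasets)
print(f"Process at {site}; move {tb_moved} TB "
      f"({transfer_hours(tb_moved, LINK_GBPS):.1f} hours)")
print(f"Moving the largest data set instead: "
      f"{transfer_hours(datasets[site], LINK_GBPS):.1f} hours")
```

Under these assumptions, bringing the small data sets to the large one takes a few days of transfer, while moving the large data set takes weeks, which is the asymmetry that draws processing and smaller data sets toward the largest mass of data.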
Helping simplify access to analytics
The combination of cloud computing and big data analytics is setting the stage for widespread adoption of cloud analytics and enabling organizations to capitalize on the benefits of analytics without investing in on-premises infrastructure. Stay tuned for a series of upcoming features in IBM Data magazine that extend this perspective with a discussion of cloud analytics taxonomy, use cases, and integration patterns.
1 Data derived from an Oracle-sourced graph in “Big Data and the Creative Destruction of Today's Business Models,” Christian Hagen, et al., AT Kearney, 2013.
2 “Deriving Innovation from a Data-Driven Mind-set,” by Ahmed Fattah, IBM Data magazine, January 2014.
3 US Census Bureau, 1890 Overview history page.
4 Census TableBuilder, Australian Bureau of Statistics.
5 Cloud Analytics, Techopedia.
6 Cloud Analytics, TechTarget.
7 “Fastest-Growing Category of Cloud Computing: Business Intelligence and Analytics,” by Joe McKendrick, Forbes, July 2012.
8 “Amazon Launches Cloud Database with Analytics Tools, Lowers S3 Pricing,” by Brandon Butler, NetworkWorld, November 2012.
9 “Self-Service, Prorated Supercomputing Fun!” by Derek Gottfrid, The New York Times Open blog, November 2007.
10 The upcoming IBM Data magazine article, “Cloud Analytics: Integration Patterns for Business Solutions,” provides a detailed discussion of cloud analytics integration patterns and integration challenges.
11 “Data Gravity – in the Clouds,” by Dave McCrory, McCrory’s Blog, December 2010.