Juggling a growing analytics workload: The hybrid data warehouse architecture

Director, Offering Management, IBM Database and Data Warehouse, IBM

In my years of working with technology, I have found that there is never a “one size fits all” solution to any need. Such is the case with data warehousing, which has seen huge changes over the years. The present is no exception. Profound changes in the way we generate and consume data must drive new trends in data warehousing. Why?

Data warehousing calls for flexibility

We all know that our data types and volumes continue to proliferate. We are also seeing that people in every role, across businesses of every size, want to leverage data for better outcomes. What’s more, they are not going to wait; they want these new analytics services now. You need a more flexible and agile approach when it comes to data warehousing. The traditional data warehouse excels at meeting core analytic functions, but it alone cannot handle the onslaught of new data and the analytics that go with them.

When you must expand into new types of analytics, you should consider some additional deployment options. There are data warehousing services on the cloud that are ideal for data born on the cloud, self-service sandboxes and more. While you can deliver new services quickly from the cloud, you may need to prepare complex and interdependent data and application components before they can go to a managed service on the cloud. In some cases, you may want tighter control of your data than a managed service allows, yet you still need the flexibility of a cloud solution. And so a middle ground is now springing up that gives you flexibility and speed in deployment, yet can be used on-premises in a private cloud or on other infrastructures. This is a software-defined environment (SDE) that relies on container technology to “forklift” a ready-to-use data warehouse into your environment without all the provisioning headaches.

Hybrid solutions are addressing a full range of problems

With three deployment choices at your fingertips (appliance, private cloud and managed service), you can implement a hybrid data warehouse architecture that targets the needs of different analytics workloads across this range of options. These deployments let you match each workload to the option that best suits its data and current processing demands. In other words, you have the tools to easily juggle workloads as needs evolve. According to Wikibon, hybrid cloud strategies enable better value creation and shorter payback periods than an all-public-cloud strategy.

If this all sounds great but you are wondering where to start, here are some use case ideas to get you thinking about a hybrid data warehouse model:

  • Rapid prototyping and agile development
    When your need is immediate but short-lived and you don’t want to lose time in infrastructure provisioning, either public or private cloud is an option. 
  • Skills and component reuse
    Using the same technologies across your deployment options gives you the flexibility and agility to meet changing needs. When each of your deployment options is based on different technology from multiple vendors, this flexibility is hard to achieve. The same goes for the skills needed to run them.
  • Analytics on operational and transaction data
    Do you shy away from doing analytics on your highly tuned transaction system? Instead, you can pull your data into a cloud or private cloud data warehouse and run analytics there.
  • Test environment
    When you are building and testing new code, you probably do not want to be on your production system. Avoid the complexities of setting up a new environment and use a cloud or private cloud deployment instead.
  • Deeper insights from more data
    What if you want to pull in public data sets to analyze with your own “system of record” data? Then a hybrid solution can house “born on the cloud” data for you, along with querying tools that reach across both data sources.
  • Performance
    When you need cost-effective, high-performance computing power to address peaks in demand, public and private cloud deployment options can help you scale up.
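
To make the idea of matching workloads to deployment options a little more concrete, here is a minimal sketch in Python. It is only an illustration and not part of any IBM product: the use-case labels, the `candidate_deployments` function and the `needs_tight_data_control` flag are all hypothetical names, and the sketch assumes that "public cloud" in the list above means the managed service. Workloads not in the table default to the appliance, on the assumption that they are core analytic functions.

```python
from enum import Enum


class Deployment(Enum):
    """The three deployment choices named in the article."""
    APPLIANCE = "appliance"
    PRIVATE_CLOUD = "private cloud"
    MANAGED_SERVICE = "managed service"


# Hypothetical lookup table based on the use cases listed above.
# Assumption: "public cloud" is treated as the managed service.
USE_CASE_OPTIONS = {
    "rapid prototyping": (Deployment.MANAGED_SERVICE, Deployment.PRIVATE_CLOUD),
    "analytics on operational data": (Deployment.MANAGED_SERVICE, Deployment.PRIVATE_CLOUD),
    "test environment": (Deployment.MANAGED_SERVICE, Deployment.PRIVATE_CLOUD),
    "born-on-the-cloud data": (Deployment.MANAGED_SERVICE,),
    "peak demand": (Deployment.MANAGED_SERVICE, Deployment.PRIVATE_CLOUD),
}


def candidate_deployments(use_case, needs_tight_data_control=False):
    """Return the deployment options worth considering for a use case.

    Unknown use cases fall back to the appliance (assumed to be core
    analytics). When tight data control is required, the managed
    service is dropped and private cloud is the fallback.
    """
    options = USE_CASE_OPTIONS.get(use_case, (Deployment.APPLIANCE,))
    if needs_tight_data_control:
        controlled = tuple(
            option for option in options
            if option is not Deployment.MANAGED_SERVICE
        )
        return controlled or (Deployment.PRIVATE_CLOUD,)
    return options


if __name__ == "__main__":
    for option in candidate_deployments("test environment",
                                        needs_tight_data_control=True):
        print(option.value)
```

A real decision would weigh cost, data gravity and governance as well; the table simply shows how one shortlist of options can serve several kinds of workload.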

I believe that 2016 is going to be the year of the hybrid data warehouse, as organizations of all sizes expand the breadth of their analytics services. IBM solutions in this area include the IBM PureData System for Analytics, IBM BigInsights for Hadoop, the IBM dashDB managed service and, now in preview, dashDB Local, a software-defined environment for private clouds and more. In particular, I hope you will take the time to test this private cloud preview and give us your feedback.