In-Memory: The Lightning in the Big Data Bottle

Big Data Evangelist, IBM

In-memory architectures are velocity with a vengeance. With business’s never-ending demand for more speed, widespread adoption of in-memory platforms is inevitable. As the cost of dynamic random access memory continues to decline, in-memory environments will become the lightning-fast data platform for all applications.

In-memory technologies will almost certainly pervade most components of big-data analytics platforms by the end of this decade. Inexorably, big data will occupy ever-growing pools of virtualized memory that span many servers in the cloud, as well as on desktops, mobile devices and other clients. By the end of this decade, petabyte memory clouds will become more feasible and budget-friendly. And as next-generation CPUs, with thousands of cores, expand the addressable memory, we’re likely to see dozens of TBs of RAM per server as a mainstream technology in that same time frame.

Nothing screams “speed of business” like in-memory architectures, which support the astonishingly fast fabric of modern life along five principal low-latency dimensions:

  • Speed of thought: In-memory is a knowledge-worker productivity booster of major proportions. Speed of thought has become the core design criterion for in-memory applications in support of business intelligence. Once you’ve tasted the speed-of-thought advantages of in-memory, you won’t turn back. Even in the so-called “small data” arena, query performance acceleration is the key to knowledge-worker productivity, and in-memory, especially when combined with columnar, has shown clear advantages over rotating storage media, particularly when used to support structured, repeatable scans and queries against very large aggregated tables. In fact, query-speed advantages have driven in-memory columnar databases and solid-state drives into the analytics mainstream, owing to these technologies’ ability to ramp up the input-output operations per second by an order of magnitude. Most knowledge workers have come to expect advanced visualizations with sub-second refresh rates. For most of us, if access speeds, query responses and load times are not instantaneous, we notice, and our productivity suffers.
  • Speed of discovery: In-memory is a core power tool for data scientists, subject-matter experts and any other professional whose job is to discover non-obvious patterns in deep data sets. Their productivity is directly tied to their ability to rapidly visualize complex relationships, build and score analytic models, and evaluate various scenarios. These people need tools that can keep up with the breakneck pace at which they cycle through alternate models. Ideally, the back-end in-memory infrastructure should leverage machine learning to discover patterns that even the most skilled human experts have overlooked. Another nirvana vision is an in-memory fabric that, silently and behind the scenes, pre-fetches all the data, visualizations, algorithms and apps it anticipates the data scientist might need and automatically pushes them down to a local in-memory cache in their modeling tool.
  • Speed of transactions: In-memory’s advantages in transactional computing are compelling. Any business that can achieve even an incremental improvement in transactional speed can exploit these operational cost efficiencies in the competitive arena. You could gain such a performance boost in transaction processing that you would free up capacity to engage in ever more complex value-added transaction processing. And if you can achieve an order-of-magnitude boost in transactional productivity, thanks to in-memory database technology, you can truly dominate in your industry.
  • Speed of response: In-memory infrastructures could deliver an order-of-magnitude improvement in the responsiveness of next-best-action environments. If you architect your business processes with an in-memory data platform, you can help to ensure that the very best data, models and rules are delivered at every moment to the appropriate decision-automation execution points. Customer churn mitigation, for example, is a time-sensitive decision that often needs a fast response driven by the very best data and models available at each split-second. From a business standpoint, the chief advantage of in-memory architectures is that they eliminate the data/analytics bottlenecks that might prevent you from taking the best course of action at whatever speed-of-business a crazy world throws your way.
  • Speed of optimization: In-memory infrastructures, if implemented end-to-end within the infrastructures of modern life, could potentially support continuous end-to-end optimization. To the extent that we instrument our lives with distributed memory, real-time sensor grids, automated feedback remediation loops, embedded decision automation, self-healing network-computing platforms, and other analytic- and rule-driven systems, we can build a self-optimizing Smarter Planet. In this way, in-memory architectures could foreshadow the even more astonishing speed of complex global optimizations that quantum-computing environments will enable. If we put on our thinking caps, quantum-powered big-data analytics might allow us to do, instantaneously and continuously, complex real-time optimizations involving zillions of variables and simultaneous calculations. The most demanding of today’s multi-scenario constraint-based optimization challenges would give way to seemingly unlimited processing power.
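
To make the columnar point from the first bullet concrete, here is a minimal Python sketch of the same in-memory table held two ways: as a list of records (row-oriented) and as one contiguous array per column (column-oriented). The table, its field names and the helper functions are all hypothetical and chosen only for illustration; real in-memory columnar databases add compression, vectorized execution and much more on top of this basic idea.

```python
from array import array

# Hypothetical sales table; names and values are illustrative only.
ROWS = [
    {"region": "east", "units": 3, "revenue": 30.0},
    {"region": "west", "units": 5, "revenue": 55.0},
    {"region": "east", "units": 2, "revenue": 18.0},
]


def to_columns(rows):
    """Pivot a row-oriented table into one sequence per column."""
    return {
        "region": [r["region"] for r in rows],
        # Numeric columns are stored as typed, contiguous arrays.
        "units": array("q", (r["units"] for r in rows)),
        "revenue": array("d", (r["revenue"] for r in rows)),
    }


def total_revenue_rows(rows):
    # Row layout: the scan walks every record object to pull out one field.
    return sum(r["revenue"] for r in rows)


def total_revenue_columns(cols):
    # Column layout: the scan reads only the single array the query needs.
    return sum(cols["revenue"])


COLUMNS = to_columns(ROWS)
```

Both functions return the same total. The difference is in how much memory each scan has to touch: an aggregate over one column of a very wide table reads a single compact array in the columnar layout, which is the kind of structured, repeatable scan the bullet describes.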

It’s too early to speculate when rotating storage media will be entirely obsolete for all applications. But you would be hard-pressed to find anyone who believes that best-of-breed big-data analytics platforms of the year 2020 will be anything but all-in-memory.

Join the conversation about "in-memory" infrastructure

Join us March 27 for what is sure to be a lively Twitterchat focusing on in-memory data and infrastructure. Just hop onto Twitter at 12:00 noon ET and join or follow the conversation using #bigdatamgmt.
