Koby’s Big Data Predictions for 2013
Some of my colleagues group-posted their big-data predictions for 2013 while I was taking a holiday break. Now all rested and relaxed, I submit for general perusal a few predictions of my own:
- Hybrid big-data deployments will become the standard: In spite of what you may have heard, Hadoop is not the sum total of big data, though it will play a growing role in 2013 and beyond. Increasingly, users realize that no one type of big-data platform will be optimal for all requirements. This year and through the end of the decade, organizations will deploy a growing variety of hybrid big-data architectures that combine Hadoop, in-memory, NoSQL, stream-computing, massively parallel RDBMS and other platforms to support a wider range of sources, workloads and applications. Many of the newer approaches will find their first significant enterprise foothold in the big-data staging tier, processing a dizzying range of new unstructured data types.
- Cross-scale data architectures will predominate: Debates about the distinction between “big data” and “small data” have become academic and tiresome. Most users don’t care. In 2013, they will expand their adoption of all scales of data management and analytic platforms, integrating them into unified business resources. The economics of storage will dictate the shape of these all-scale data architectures. Generally, organizations will deploy the more expensive, lower-volume (low terabytes), higher-velocity (real-time) platforms closer to user applications, while deploying the cheaper, higher-volume (petabytes), lower-velocity (batch) platforms closer to the data sources. The staging, landing, or pre-processing tier will be where most real-world deployment of peta-scale Hadoop, NoSQL, graph and other specialized databases will take place. The RDBMS-based enterprise data warehouse (EDW) will increasingly anchor the hub tier, midway in scale between the nouveau big-data platforms in the staging tier and the in-memory platforms in the access, mart and exploration tier.
- Governance will become a prime focus of maturing big-data deployments: The novelty factor of Hadoop and other big-data approaches is wearing off in the business world, and organizations will deploy more official systems of record on these platforms in 2013. As adoption of a big-data “single version of truth” expands, organizations will evolve their data governance practices to efficiently handle new data sources, more complex data schemas, and greater volumes, velocities and varieties of governance workloads. At the same time, more organizations will look for automation tools and repositories to support governance of the expanding pool of MapReduce, R, machine learning and other big-data analytic models being produced by their data scientists.
- Data science centers of excellence will spring up everywhere: As the role of data scientists grows in the business world, more organizations will establish internal centers of excellence in 2013 to foster standardization, reuse, collaboration, governance, and automation within and across advanced analytics initiatives. These centers of excellence will leverage and extend established analytics best practices. They will provide a convergence point for statistical analysts and subject-matter experts looking to share their expertise. They will also provide forums and resources for long-time BI and data management professionals to enhance their skills in hot new areas such as text mining, sentiment analysis, social network analysis, behavioral analytics and ensemble modeling. These centers of excellence will also evolve into lifelong learning programs for business analysts to acquire full-blown data-science skills. These programs will make a significant dent in the talent gap that hamstrings many of today’s business big-data initiatives.
- Next-best-action deployments will become more cross-application: Next best action will deepen its growing impact as the “killer app” for big data. Traditionally, next best action has been a capability embedded in siloed applications supporting enterprise customer service, marketing and other “front-office” requirements. However, the past several years have seen growing adoption of standalone next-best-action infrastructure that spans diverse customer-facing and back-office applications. In 2013, we will see more enterprises align and converge their investments in big data, predictive analytics, business process management, business rules management, stream computing, decision automation and other technologies into general-purpose next-best-action infrastructure. In addition, more enterprises will rely on vendors such as IBM to provide packaged next-best-action solutions that incorporate data schemas, predictive models and other artifacts that gear the infrastructure to support key big-data initiatives, including multichannel offer targeting, marketing campaign automation and customer experience optimization.
And of course I could offer many more predictions. But I’ll save them for future IBM blogs, quick-hits, podcasts, tweetchats, articles, speeches and other big-data evangelization activities in 2013.
Good to be back. Happy New Year!