Big data and analytics trends in 2017: James Kobielus’s predictions
Every December, I publish my predictions of data industry trends for the upcoming year. Here are my forecasts for trends in big data analytics, data science, predictive business and cognitive computing in 2017. I’ve also included a few looks back at the year now ending and some long-range projections into the coming decades.
What was the most surprising new data industry trend of 2016 that you didn’t see coming?
I didn’t expect Hadoop to recede from the big data platform landscape as rapidly as it did. Hadoop is still a mainstay of unstructured data acquisition, transformation, cleansing and queryable archiving, but its core components (MapReduce for massively parallel data processing and HDFS for data storage) are conspicuous by their absence from newer platforms. This trend is most visible in IBM’s newly launched Watson Data Platform (WDP), which uses Apache Spark instead of MapReduce and distributed object storage in lieu of HDFS. Although we’ll continue to see WDP and other next-generation cloud data platforms source data from Hadoop clusters, it’s clear that Spark is the centerpiece of the new cloud data services platform, which is geared to accelerating the productivity of development teams working on data science projects.
Which data industry trends do you expect to dominate in 2017?
- Traditional programmers will be required to gain data science skills in order to stay relevant, employable, and effective in their careers.
- High-priority enterprise application projects will focus on developing artificial intelligence (AI), machine learning and cognitive computing assets for production deployment.
- Disruptive enterprise application projects will focus on streaming media analytics, embedded deep learning, cognitive IoT, conversational chatbots, embodied robotic cognition, autonomous vehicles, computer vision and autocaptioning.
- Data scientists will hold operational responsibilities that focus on designing, deploying, monitoring and managing real-world experiments, A/B tests, machine learning and predictive analytics assets inline to core business processes and customer touchpoints.
- Data scientists will work within integrated, multidisciplinary cloud-based development environments that incorporate standardized notebooks, access to deep algorithm libraries, composable containerized microservices, rich collaboration and project tracking tools, and robust security and governance controls.
- Open source tools focused on embedded deep learning and cognitive IoT will come into data app developers’ core workbenches, supplementing and extending R, Spark and Hadoop.
- Data scientists from nontraditional professional backgrounds, including self-taught “citizen data scientists,” will work alongside experienced data scientists on high-priority enterprise projects.
- Data scientists will establish their careers and boost their visibility in open competition communities such as Kaggle and Topcoder.
- More of the training data and curation in data science initiatives will come from crowdsourcing environments.
- More stages of the machine learning development pipeline will be automated through advances in unsupervised learning.
- Enterprise applications will evolve to leverage the added value of AI, machine learning and predictive analytics.
- Enterprise applications will be built to run on intelligent, dynamic swarms of semi-autonomous drones.
Which of the most promising new data technologies do you expect to gain traction in 2017?
We’re going to see mass deployment of a new generation of optimized neural chipsets, GPUs and other high-performance cognitive computing architectures. These increasingly miniaturized components will be the foundation for most new cognitive mobile, cognitive IoT and cognitive cloud applications that come to market in 2017 and beyond. As low-cost, low-power cognitive hardware platforms appear, more consumer products will incorporate them at attractive price points for mass adoption. Over-the-air or remote distribution of machine learning and other algorithmic artifacts, as well as security patches and updates, will become the standard approach. Cognitive algorithms will be compressed to run on hardware platforms that are resource-constrained, small-footprint, low-power and intermittently connected.
What data jobs, skills and roles will be in hottest demand in 2017?
People who can design AI-powered products that combine robotics, embodied cognition, IoT fog computing, deep learning, predictive analytics, emotion analytics, geospatial contextualization, conversational engagement and wearable form factors will be in hot demand. We are at the start of an amazing period of mind-blowing innovation in the consumer and industrial economy. Human artifacts of every kind are being retrofitted with AI capabilities or designed from the ground up either to function as intelligent assistants or to autonomously handle many chores that their owners can’t do or prefer not to do themselves.
Which older approaches to data management and analytics are in danger of disappearing in 2017?
Purely premises-based data platforms are vanishing as more organizations begin to rely completely on public cloud services such as Watson. Beyond that, it’s hard to identify an older approach to standing up data management and analytics that’s in danger of disappearing in 2017. As enterprises increasingly implement hybrid and zone architectures for their end-to-end data platforms, even the most mature technologies, such as relational databases, OLAP cubes and columnar architectures, have gained a new lease on life within well-defined deployment models and use cases.
Which data technologies and trends are the most overhyped?
It’s hard to say. Many emerging technologies—such as cognitive computing—have a long ramp-up to mass adoption. During that period, the promises may at times overreach and the still-immature implementations might disappoint some observers. Autonomous vehicles and drones have been hyped out of proportion to their mass adoption so far, so you could put them in this category. But when you see them demonstrated and occasionally play with them on your own, you also get a sense that they’re getting closer to commercial readiness. Hype is relative to the speed at which any innovation approaches this maturity tipping point.
What will the data industry landscape of 2020 look like?
Public cloud data and streaming analytics services will predominate. Petabytes of storage in these clouds will be quite affordable. Every service you might use in these clouds will incorporate cognitive computing and machine learning both at the application level and in its core deployment, monitoring, optimization and governance tooling. Data scientists will be the core developers of all of these cloud services.
How pervasive will cognitive computing become in the next 10 to 20 years?
AI and cognitive computing will be baked into every physical artifact of our lives, including our cars, homes, appliances and other possessions, and into every component of public infrastructure, even the physical environment. The entire planet, and even the solar system and interstellar space, will be sensed, monitored, managed and optimized by cognitive IoT. As we move deeper into the 21st century, humanity will deploy cognitive IoT across the universe as an extension of our organic nervous system, representing the realization of Arthur C. Clarke’s vision of “practical magic.” Humanity may even develop the long-sought (or long-feared) “master learning algorithm” by the middle of this century, though I personally feel that we’ll see sustainable nuclear fusion before we get to that point.
By the way, here are my cognitive analytics industry predictions from a year ago. As near as I can tell, I was fairly prescient. Among other proof points, check out this recent blog post of mine discussing the key cognitive analytics announcements at IBM World of Watson.
Want to share your predictions for big data, analytics, cloud data services, data science and cognitive computing in 2017? Join me and other industry experts for a live #MakeDataSimple Crowdchat on Thursday, December 8 at 1:00 PM ET. Click here to register for the event or simply browse to #MakeDataSimple to follow the discussion.
And please visit this site to learn how you can use the Watson Data Platform to put cognitive business to work in your business in the new year and beyond.