The future of AI looks bright with the IBM and Cloudera Partnership

Product Marketing Manager for Data Lake & Cloudera Partnership, IBM

In 2014, Stephen Hawking warned that artificial intelligence could “end mankind.” Over the years, many sci-fi films, from 2001: A Space Odyssey to I, Robot, have alluded to the same potential perils. Yet, five years after Dr. Hawking’s warning, the positive impact of AI is already evident for both consumers and businesses.

While consumers delight in the functionality of their virtual assistants, enterprise companies are using AI for greater productivity and efficiency by improving customer relations, proactively detecting and mitigating fraud, and providing new levels of innovation.

Still, AI cannot exist without the right foundation.

AI starts with data collection

Tremendous data growth – fueled by new sources of semi-structured and unstructured data including social media, streaming video, and clickstreams – has made collecting and analyzing data the biggest challenge for most businesses. This means data can sit unused unless the proper architecture has been put in place.

A hybrid data management foundation can help overcome this challenge by providing the ability to select any type of database or data warehouse, choose best-of-breed and open source software, and leverage existing investments. This is key when aggregating disparate data because it allows you to collect all data types on the best platform for them to drive the most complete insights.

AI requires governance, scalability, and security

Of course, collecting data is only part of what it takes to establish an AI foundation. The data collected must also be governed in such a way that clean, relevant data can be found and used easily. Selecting the right governance and integration tools will help ensure data is properly cleaned, classified and protected, and that it arrives in timely, controlled data feeds documented with reliable information assets and metadata. This will make using data for analytics and AI more efficient and effective.

Solutions supporting AI also need to be highly scalable in terms of both the amount of data that can be taken in and the number of users on the platform. Thousands of users should be able to access petabytes of diverse data without issue to help them find insights on a wide range of subjects. In addition, enterprise-grade security must always be a top priority. Businesses should use multiple layers of security with an emphasis on complete auditability so that they can prevent unauthorized data access and demonstrate accountability for actions taken. Not doing so could introduce greater risks than the benefits of AI can offset.

AI is supported by quick data access and data science tools

Making data accessible, regardless of where the data lives or how it is structured, enables your data users, data scientists, line-of-business owners and developers to make quicker and more accurate data-driven decisions. Businesses should be able to query data across the company easily using a SQL engine for Apache Hadoop that is capable of concurrently exploiting Hive, HBase and Spark with a single database connection or query. Active-active data replication for Hadoop can also provide the option of replicating big data from lab to production, production to disaster recovery sites, or ground to cloud object stores.

For analysis and more direct support of AI, industry-leading engines should be used to process and query data, as well as develop and serve models quickly. Data science tools should also be employed to accelerate machine learning (ML) and deep learning workflows for AI. The best tools can be used to build and train AI and ML models in a single environment, and prepare and analyze data with automated deep learning and enhanced visual modeling. Not only will this help provide the performance AI demands, but it will also encourage collaboration within the data science environment.

The IBM and Cloudera strategic partnership

Because of the need for data collection, governance, data access, and data science tooling, some mergers and partnerships have occurred in the marketplace to deliver a robust set of hybrid data management options. On January 3, Cloudera, the enterprise data cloud company, announced the completion of its merger with Hortonworks, Inc. Then, on May 28, the partnership between IBM and Cloudera also became official.

The combined expertise and solutions of these companies offer customers a direct path to AI. Clients get easy access to the capabilities, scalability and economy of a modern data platform; additional governance and security features; and the tools for data federation, advanced query and data management. The result is an open-source data and analytics solution that’s ready for the enterprise and ready for the AI-infused future.

Together, IBM and Cloudera offer a modern data platform with the governance and security to drive the future of AI and ML. Our solutions are optimized for the cloud, but we give our customers options to put their data where it works best for them: on-premises, or in public, private or hybrid clouds. We provide the tools needed to better capture, store, explore and manage big data, and we offer packaging and licensing options that simplify the process and help future-proof growing organizations.

For more information on IBM and Cloudera’s ability to help businesses get the most out of AI, visit our IBM and Cloudera partnership page.