The latest release of IBM Cloud Pak for Data, v2.5, has three key themes: Red Hat integration, new built-in capabilities such as Watson tools and runtimes, and a heavy focus on open source (https://www.ibmbigdatahub.com/blog/announcing-cloud-pak-for-data-2-5). Open source is widely adopted in
In a previous blog, I explained how data science capabilities, massively parallel processing (MPP) and usability improvements in data warehouse appliances can help the bottom line, and why old-fashioned architectures might not cut it. But what does that look like in practice?
Research firm Quark +
In the connected world of today’s digital economy, apps, IoT devices, vehicles, appliances and servers are generating an endless stream of event data. This stream of events describes what is happening over time and offers the opportunity to track and analyze things as they happen.
Data is a potent business resource and the key to gaining and maintaining competitive advantage. Last month, IBM and Hortonworks announced a partnership to bring data science to the world on an open platform, offering Hortonworks Data Platform (HDP) along with IBM Data Science Experience (DSX) and
Recently, I had the honor of speaking with a number of the world’s most influential thought leaders in the fields of data science, data analytics, machine learning and digital transformation. This group of prominent data technologists was more than happy to answer a wide variety of questions on
In any successful modern organization, analytics is likely to play a central role in helping decision-makers design and execute effective business strategies. At IBM, as we work with clients across the globe, we’re seeing ever-increasing levels of maturity and confidence in data-driven business
Most of us are privileged enough to live in societies where clean water is a given. When our physicians and the media advise us to stay hydrated, it’s a simple health consideration. But in many countries, access to water is a huge issue – approximately one third of the world’s population does not
For today’s data scientists and data engineers, the data lake is a concept that is both intriguing and often misunderstood. While there are many good resources about data lakes on ibm.com and other websites, there is also a lot of hype and spin. As a result, it can be difficult to get a clear
Building a data lake is one of the stepping stones toward data monetization and many other advanced revenue-generating and competitive-edge use cases. What are the building blocks of a “cognitive trusted data lake” enabled by machine learning and data science?
This white paper discusses the advantages of using the PySpark API, which enables the use of Python to interact with the Spark programming model. It starts with a basic description of Spark and then describes PySpark, its benefits, and when it is appropriate to use instead of "pandas" open source
J White Bear is a data scientist and software engineer at IBM. In this podcast, White Bear discusses simultaneous localization and mapping, an ongoing research area in robotics for autonomous vehicles that is well recognized as a nontrivial problem space in both industry and research.