Although NoSQL database technology has been around for a long time (it actually predates SQL), its popularity did not really take off until the advent of Web 2.0, when companies such as Google and Amazon began using it. Market Research Media forecasts the NoSQL market to be $3.4 billion by
Building a data lake is one of the stepping stones toward data monetization and many other advanced revenue-generating and competitive-edge use cases. What are the building blocks of a “cognitive trusted data lake” enabled by machine learning and data science?
In many cases, the data lake can be defined as a superset of data repositories that includes the traditional data warehouse, complete with traditional relational technology. One significant example of the different components in this broader data lake is in terms of different approaches to the
This white paper discusses the advantages of the PySpark API, which enables the use of Python to interact with the Spark programming model. It starts with a basic description of Spark and then describes PySpark, its benefits, and when it is appropriate to use it instead of the open source "pandas" library.
This is the second in a series of blogs on analytics and the cloud. We will consider the rise of the Internet of Things (IoT), the analytics applied to that data, and how the cloud can be used to drive value out of instrumenting a very wide range of ‘things’.
Fundamentally, machine learning is a productivity tool for data scientists. As the heart of systems that can learn from data, machine learning lets data scientists train a model on an example data set and then leverage algorithms that automatically generalize and learn from that example
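The train-then-generalize idea can be sketched in a few lines of plain Python: fit a simple model to example (x, y) pairs, then use it to predict for inputs it never saw. This is an illustrative toy (ordinary least squares on a line), not any specific IBM method.

```python
def fit_line(points):
    """Ordinary least squares for y = a*x + b over example (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# "Training": example data generated by the rule y = 2x + 1.
train = [(0, 1), (1, 3), (2, 5), (3, 7)]
a, b = fit_line(train)

# "Generalization": the fitted model predicts for an unseen input.
def predict(x):
    return a * x + b

print(predict(10))  # 21.0 -- the model recovered y = 2x + 1
```

Real machine learning swaps this hand-rolled line fit for far richer model families, but the workflow, fit on examples and then apply to new data, is the same.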
J White Bear is a data scientist and software engineer at IBM. In this podcast, White Bear discusses simultaneous localization and mapping, an ongoing research area in robotics for autonomous vehicles that is widely recognized as a nontrivial problem space in both industry and research.
In the cognitive computing era, a new revenue generation stream has emerged with data at the center of the modern digital business model. One of the key capabilities cognitive computing enables for an organization is the ability to generate additional revenue streams by using data effectively. In the big
IBM’s community of big data developers continues to grow. As our Big Data Developer meetup program moves into its fifth year, this worldwide community of customers, partners and IBM developers is on the verge of enlisting its 100,000th member—when we published this blog, we counted 99,100.
Seth Dobrin is vice president and CDO, IBM Analytics, platform development, at IBM. In this podcast, Dobrin shares experiences using Apache Spark for data science transformation and some thoughts on a larger vision for data science transformation at scale.
In this white paper, discover how programmers and data scientists can use SparkR to transform R into a tool for big data analytics, taking advantage of parallel processing and near-linear scaling to tackle much larger challenges than would normally be possible with other methods.
The grand finale of the first IBM France Sparkathon invited Apache Spark developers to outthink the frontiers of client insights. Get the details on this event held during the IBM Business Connect conference and the application that took the top prize.
Analyzing streams of big data in real time can have a big impact on competitive advantage. In a world of bewildering stream processing engine choices, explore the use-case-dependent alternatives that can provide well-suited business outcomes, courtesy of expertise from Roger Rea and Jacques Roy.