The combination of Jupyter Notebooks, Apache Hadoop and Apache Spark has become a killer app for data practitioners. It unlocks the ability to explore, visualize and experiment with both structured and unstructured data sets with great ease and efficiency. We spoke recently with Chris Snow at IBM
SparkOscope helps Apache Spark developers take advantage of the job-level information available through the existing Spark Web UI; minimizes source code pollution; and extends the Spark Web UI with a palette of system-level metrics about the server, virtual machine or container related to each
Apache Spark, sometimes called the “analytics operating system,” is empowering organizations of all kinds through machine learning by helping them create unprecedented value from their data. Discover eight ways that Apache Spark’s machine learning capabilities are driving the modern business.
Data science seems to be experiencing a renaissance when it comes to advanced open source tools. Get a glimpse into creative application development with IPython Notebooks, Jupyter Notebooks, Apache Spark, the PixieDust open source library and more at IBM Insight at World of Watson 2016.
IBM extended Big SQL, which was formerly exclusive to the IBM Open Platform (IOP), to the Hortonworks Data Platform (HDP) in September 2016. I recently spoke with Berni Schiefer, an IBM Fellow in the IBM Analytics group, to learn more about the offering and the ongoing IBM focus on SQL.
Historical application of vector mathematics and the study of unstructured text data can be an important approach to understanding and actualizing the value of data. See how mathematical exploration of text data can unearth insight that translates into enhanced decision making.
IBM Insight at World of Watson 2016, 24–27 October 2016, at Mandalay Bay in Las Vegas, Nevada, is the only place to be for people who work with data. Take a look at this list of the top ten reasons you won’t want to miss out on one of the most intriguing and innovative events of the year.
Advances in tools and the capability to work with cloud-based data sets are dramatically changing the nature of data science workloads. Take a look at one data scientist’s quest to learn more about performing data science analysis in the cloud.
Nancy Hensley, director of offering management for IBM Analytics, speaks with Rob Thomas, vice president of development for analytics at IBM, about business transformation, leading to a discussion of the data maturity curve.
Despite big data’s hype, a significant number of organizations are still in a holding pattern—either locked in planning, hesitant to get started or wanting to avoid Apache Hadoop and Apache Spark projects. Complexity and a shortage of skills can exacerbate the situation. Increasingly, organizations
The concluding week of September 2016 offered much excitement in New York City, the backdrop for Strata + Hadoop World 2016 and several key IBM announcements, including the launch of a cloud-based, self-service environment for data science teams. Enjoy some key highlights captured from this
Apache Spark 2.0 and Apache SystemML are now available in IBM BigInsights for Apache Hadoop on Bluemix, accessible in the Basic plan under BigInsights for Apache Hadoop. Choose from among a wider array of capabilities than ever before as you take advantage of these additions to your cluster
Open data science initiatives can be a revolutionary force for innovation that spans diverse industries. And that force comes from the people in different roles and with various skill sets who use open source data science tools to develop and deploy new designs for working and living. Discover why
The productivity of data science teams, often challenged by access and formatting minutiae, can be enhanced by automating many of the manual tasks these teams perform. Take a peek inside the mind of a data scientist, and see how acceleration of the data science development pipeline can boost