In this white paper, discover how programmers and data scientists can use SparkR to transform R into a tool for big data analytics, taking advantage of parallel processing and near-linear scaling to tackle much larger challenges than would normally be possible with other methods.
Holden Karau is a software engineer at IBM, an active open source contributor and coauthor of Learning Spark (O'Reilly Media, February 2015) and the soon-to-be-released High Performance Spark (O'Reilly Media, March 2017). In this podcast, Karau examines how to effectively search logs from Apache
Nick Pentreath is a principal engineer at IBM, a member of the Apache Spark project management committee (PMC) and author of Machine Learning with Spark (Packt Publishing, December 2014). In this podcast, Pentreath covers the basics of feature hashing and how to use it for all feature types in
Today’s businesses need a culture of collaboration that empowers knowledge workers to glean cognitive insights from data that help transform and modernize operations. See how cloud-based platforms and solutions enable data scientists and other experts to exploit artificial intelligence, machine
Emily Curtin is a software engineer at The Weather Company (now IBM) working on the data engineering platform team. Robbie Strickland is vice president, engines and pipelines, IBM Watson Data Platform, at IBM. In this podcast, they give a technical overview of how Parquet works and how recent
Now that we’re into the swing of 2017, the time is ripe for the first CrowdChat of the year to explore the goals, challenges and strategies that CDOs and CIOs are focused on for their organizations. Get involved and share your thoughts in this kick-off IBM Big Data CrowdChat.
Businesses have come to expect that smart rivals wielding digital technologies will disrupt their competitive landscapes. How ready is your organization to be a digital disruptor? Take a look at detailed criteria for assessing your organization’s readiness and the strategic steps you can take to
We might not all be nuclear physicists, but some of us are. Take Dr. David Farley, for example, who is a principal member of Sandia’s technical staff, with the Department of Energy. David, who works with some of the world’s brightest minds in the fields of nuclear energy and security, is a
If simplicity can fundamentally accelerate focused action, then you can significantly boost speed, productivity and effectiveness in your enterprise. Take a look at this overview of key announcements unveiled on the first day of IBM Insight at World of Watson 2016.
The combination of Jupyter Notebooks, Apache Hadoop and Apache Spark has become a killer app for data practitioners. It unlocks the ability to explore, visualize and experiment with both structured and unstructured data sets with great ease and efficiency. We spoke recently with Chris Snow at IBM
Data science may be the en vogue profession descriptor for what many who work with statistics and statistical modeling do, but being a data scientist in this era requires a unique set of skills and experiences. See why data science may not be a crystal ball capable of predicting all events, but how
IBM Insight at World of Watson 2016, 24–27 October 2016, at Mandalay Bay in Las Vegas, Nevada, is the only place to be for people who work with data. Take a look at this list of the top ten reasons you won’t want to miss out on one of the most intriguing and innovative events of the year.
Advances in tools and the capability to work with cloud-based data sets are dramatically changing the nature of data science workloads. Take a look at one data scientist’s quest to learn more about performing data science analysis in the cloud.
The concluding week of September 2016 offered much excitement in New York City, the backdrop for Strata + Hadoop World 2016 and several key IBM announcements, including the launch of a cloud-based, self-service environment for data science teams. Enjoy some key highlights captured from this