
Extending the power of R to everyone

Product Marketing Manager, IBM

When I am not moonlighting as a product marketing manager at IBM, I can be found studying for a Master of Science in Predictive Analytics. As I traverse the coursework, manipulate data and apply theoretical data mining and analytics concepts to projects, I sometimes find myself in a predicament.

I am by no means a programmer (at least not yet) and, to me, coding is the next best alternative to a root canal. But IBM SPSS Modeler gives me a graphical user interface where I can use a canvas and nodes (each representing an action, such as a logistic regression algorithm or a data preparation and cleansing step) to build predictive models in an intuitive, stream-like format. This tool is fantastic for taking any type of data, preparing it and then extracting the insights and patterns hidden in it for smarter decision making.

And when I am confused as to which algorithm is best suited for my problem, SPSS Modeler can help me pick the best one. Take a look at the following graphic, which shows the application of SPSS Modeler to the task of predicting customer attrition. SPSS Modeler actually tells me which three algorithms best help predict customer churn in my data. 

Back to my travails with analytics, however. There might come a time when a project requires additional features or statistical functions. Consider the Kaggle Titanic survivor prediction case study, which introduces the Random Forest technique, developed by Leo Breiman and Adele Cutler, using Python and R. For those of us who think “Random Forest” was featured in The Lord of the Rings, it is an ensemble learning method that basically says there is strength in numbers: it builds many decision trees that all differ, each trained on a random sample of the data with the candidate variables chosen at random at each “branch,” and then lets the trees vote on the answer. Trust me, it works, and it is amazing.
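For the curious, here is a minimal sketch of that "strength in numbers" idea in plain Python. It is not the code from the Kaggle case study and not what SPSS Modeler or any R package runs internally; it is a toy illustration under stated assumptions. The function names, the depth limit, the square-root rule for how many variables to try at each branch, and the handful of data rows are all invented for this example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, n_features, rng):
    """Find the best threshold, looking only at a random subset of variables."""
    best = None  # (weighted impurity, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, n_features, max_depth, rng):
    """Grow one tree. A leaf is a plain label; a branch is a 4-tuple.

    Assumes labels are not tuples (ints or strings work fine).
    """
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, n_features, rng)
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f,
        t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   n_features, max_depth - 1, rng),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   n_features, max_depth - 1, rng),
    )


def predict_tree(tree, row):
    """Walk down the branches until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def train_forest(rows, labels, n_trees=25, max_depth=3, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))  # assumed rule of thumb
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 n_features, max_depth, rng))
    return forest


def predict_forest(forest, row):
    """Every tree votes; the most common answer wins."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Invented toy rows, NOT real Titanic data: [age, fare, is_female]
toy_rows = [
    [22, 7.25, 0], [38, 71.3, 1], [26, 7.9, 1], [35, 53.1, 1],
    [35, 8.05, 0], [54, 51.9, 0], [2, 21.1, 0], [27, 11.1, 1],
]
toy_labels = [0, 1, 1, 1, 0, 0, 0, 1]  # 1 = survived, 0 = did not (made up)

forest = train_forest(toy_rows, toy_labels, n_trees=25, max_depth=3, seed=42)
print(predict_forest(forest, [30, 60.0, 1]))
```

The two sources of randomness do the real work here: each tree sees a slightly different sample of rows, and each branch is only allowed to consider a random subset of the variables, so the trees make different mistakes that tend to cancel out when they vote. In practice you would reach for an established library rather than hand-rolled code like this.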

So, if you either abhor coding or are just starting out, using the technique described above could be a challenge. But not to worry. IBM fully embraces open source R and supports its integration with IBM SPSS Modeler. You can take advantage of the combined power of R's ever-expanding list of libraries and analysis tools and the superior data management and user-friendly interface of IBM SPSS Modeler.

Enter Predictive Extensions for IBM SPSS Modeler. These extensions, available as a plugin and a quick and easy download, can help with the heavy lifting, enabling you to do what you do best. Simply downloading the extension and adding it to my SPSS Modeler stream enabled me to do an analysis with speed and ease. The following illustration shows how simple it is to build a predictive model with these extensions. 

In today’s world of fast actions and faster results, speed is an advantage that can take you and your organization to the next level. Check out the first set of Predictive Extensions on Analytic Zone and let me know which one is your favorite and which extension you'd like to see next. We are always looking for ideas. And if you want to learn how to make your own, ask and we’ll show you how. You can reach me by leaving a message in the comments below.