Accessing the power of R through a robust statistical analysis tool

Big Data Evangelist, IBM

Open source tools continue to foster non-stop innovation throughout the Insight Economy. So it’s no surprise that open-source languages—most notably, R—have moved to the center of enterprise statistical analytics and data management.

R is a pillar of statistical processing in the new era of enterprise data science. R has proven itself over many years of rising commercial adoption as a high-quality tool for statistical analysis, predictive analytics, data-driven planning and forecasting, and data visualization. 

For today’s professional statistical analysts, R possesses the following core strengths:

  • Open: R’s open-source provenance ensures that whenever a new analytical approach is developed, it is released to the entire R community almost immediately once it has been submitted and tested through the R project.
  • Mature: R has been on the market for more than two decades, is field-proven in many enterprise applications, and is an integral component of many commercial solutions such as IBM SPSS Statistics.
  • Popular: R has become very popular with statisticians and data miners for developing statistical software and is widely used for advanced data analysis.
  • Comprehensive: R provides a wide variety of advanced statistical and graphical techniques.
  • Inexpensive: R is available as Free Software under the terms of the Free Software Foundation GNU General Public License.
  • Deployable: R runs on Windows, Mac OS, and a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux).
  • Intuitive: R is an object-oriented language that is easy for the average statistical analyst to learn and master.
  • Composable: R facilitates development and composition of statistical models as components of larger analytic workflows.
  • Embeddable: R is easier to embed into applications than other statistical languages.
  • Extensible: R packages can be easily extended in the statistical analysis tool of your choice.

Though open-source R offerings on the market are increasingly robust, there are compelling reasons why users turn to production-ready commercial analytics solutions, such as IBM SPSS Statistics, for their development in R. For established R programmers, the principal advantages of adopting commercially supported tools are several:

  • Leverage a field-proven statistical analysis tool that has been in-market for decades and been used for many types of high-performance statistical analysis in every type of business, as well as academia, science, and government;
  • Program advanced analytics using common statistical programming languages in the market, including but not limited to R;
  • Develop R statistical analyses and advanced visualizations on large and complex datasets using advanced algorithms and procedures in an integrated commercial platform;
  • Process and deploy R analytic models faster with flexible deployment options;
  • Access the commercial platform’s rich data-management capabilities to handle much larger data sets than traditional R open-source tools;
  • Use the platform’s tailored functionality and customizable interfaces for different skill levels and functional responsibilities;
  • Create high-resolution graphs and presentation-ready reports to easily communicate results of analyses programmed in R;
  • Leverage the commercial platform’s richer set of graphical and pivot table output options, which can lead to a better user experience; and
  • Use the commercial platform as a deployment vehicle to distribute R packages to a wide range of users.

For longtime SPSS Statistics users, the advantages of adding no-charge R support to their existing licensed tools are as follows: 

  • Tap into the R community’s rich, ever-expanding collection of R statistical analysis and graphing libraries
  • Access nearly 4,000 open-source R statistical functions
  • Use R functions that are not available within SPSS Statistics without having to learn R

If you’ve licensed SPSS Statistics, you can take advantage of R by installing a no-charge plug-in into the tool.  The plug-in—part of a family of no-charge add-ins for Python, Java and .NET—enables SPSS developers to take advantage of R extensions written either by themselves or others. The plug-in—available for Windows, Linux, Mac OS and SPSS Statistics Server—has the following features: 

  • Enables R programs to communicate with SPSS Statistics by means of APIs;
  • Extends the SPSS Statistics command syntax language with the full capabilities of the R programming language;
  • Provides access to an R integrated development environment so that users can develop, test, and debug R programs for use with SPSS Statistics;
  • Enables users to take advantage of the R programs that others have written, packaged, and shared as extension bundles, which are pre-coded algorithms that eliminate the need for in-depth R programming;
  • Enables users to write their own shareable R programs and integrate them into SPSS Statistics at various levels: by creating custom dialogs that generate the syntax for an R extension command or explicit R code, by creating an extension command implemented in R, or by running R code directly from within SPSS Statistics;
  • Allows users to wrap R and Python code in SPSS syntax through extension commands and thereby use them in all jobs without programming; and
  • Lets users turn any R procedure from GitHub or anywhere else in the open-source community into an extension.
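
To make the idea of an R procedure that could become an extension a little more concrete, here is a minimal sketch in plain R. It is an illustration under stated assumptions, not code taken from the plug-in: the function name fit_summary and its arguments are hypothetical, and the built-in mtcars data set stands in for an SPSS Statistics dataset. Inside SPSS Statistics, the data frame would instead be supplied by the plug-in (its documentation describes data-access functions such as spssdata.GetDataFromSPSS), and the R code would sit between BEGIN PROGRAM R. and END PROGRAM. in the command syntax.

```r
# Hypothetical helper: the kind of self-contained R procedure that could be
# wrapped as an extension. The name and arguments are illustrative only and
# are not part of any SPSS Statistics or R API.
fit_summary <- function(data, response, predictors) {
  # Build the model formula from column names, e.g. mpg ~ wt + hp
  model_formula <- reformulate(predictors, response = response)
  model <- lm(model_formula, data = data)

  # Return the coefficient table as a plain data frame, ready for display
  coefs <- as.data.frame(coef(summary(model)))
  coefs$term <- rownames(coefs)
  rownames(coefs) <- NULL
  coefs[, c("term", "Estimate", "Std. Error", "t value", "Pr(>|t|)")]
}

# Stand-in data: mtcars ships with R, so this runs without SPSS Statistics.
result <- fit_summary(mtcars, response = "mpg", predictors = c("wt", "hp"))
print(result)
```

Keeping the procedure as a single function that takes a data frame and returns a data frame is a design choice for this sketch: it keeps the statistical logic independent of where the data comes from, so the same function can be tested in ordinary R before it is packaged for other users.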

Hungry for more information on R development in SPSS Statistics? Check out this informative page and start your free trial today. 

Users should also visit the SPSS Community to view and download from an extensive library of useful extensions, as well as upload their own extensions for others to use. The community also has a very active forum where users can ask questions and get answers on issues related to R programming within SPSS Statistics.