
Tackling predictive uncertainty with Monte Carlo statistical analysis

Big Data Evangelist, IBM

Predictive uncertainty permeates most decision scenarios. Statistical models have great value when they help you to reduce that uncertainty to the point where you feel comfortable enough to take calculated risks based on likely future outcomes. 

If you’re a statistical analyst and you already know the values of key input variables (such as the cost of materials and the price you can charge for finished products on the open market), predicting target variables, such as the profit you’re likely to earn, is straightforward. This is the case when you’re making predictions based on known, historical data. In those cases, you typically build your predictive model using a statistical technique such as linear regression.

When, however, you don’t know some or all of your input variables with certainty, linear regression and similar techniques become problematic. And that, in fact, is the situation most statistical analysts face in the real world, where economic conditions cause prices to fluctuate and myriad other risk factors may turn your initial modeling assumptions on their heads. This is the case in many predictive exercises where your input variables must also be predicted as you’re trying to forecast a specific outcome. For those circumstances, you want a predictive technique that can account for uncertainty in the independent variables. And that’s where Monte Carlo simulation techniques come in very handy.

Essentially, Monte Carlo simulations predict an outcome not from the actual values of input data (which aren’t known) but from likely (aka “simulated”) values of that data, drawn from their probability distributions. Also known as Monte Carlo “experiments,” this approach involves repeated random sampling from the inputs’ probability distributions in order to estimate target outcomes. In a typical Monte Carlo experiment, the exercise is repeated thousands or tens of thousands of times to produce a distribution of likely outcomes. These simulations can prove invaluable for assessing risks in many real-world decision scenarios, such as investment appraisals, business and strategic planning, marketing and sales forecasting, and pricing.

The applications of Monte Carlo simulation are many. In finance, Monte Carlo methods are often used to calculate the value of companies, to evaluate investments in projects at a business unit or corporate level, or to evaluate financial derivatives. Monte Carlo methods are widely used in engineering for sensitivity analysis and quantitative probabilistic analysis in process design. They are very important in computational physics, physical chemistry, and related applied fields. And they have recently been incorporated in game-playing algorithms that have outperformed previous algorithms in games like Go, Tantrix, Battleship and Havannah.

Clearly, Monte Carlo simulations can benefit from the data-crunching prowess of today’s most powerful computers. Nevertheless, most of today’s processors can handle the intensive computations that Monte Carlo simulation requires. Though you can perform these calculations in spreadsheets, it’s best to have a sophisticated statistical software program, such as IBM SPSS Statistics, that has been optimized for Monte Carlo simulations.

Regardless of what tool you’re using, running a Monte Carlo simulation involves three basic steps:

  • Set up the predictive model, identifying both the dependent variable to be predicted and the independent variables (also known as the input, risk, or predictor variables) that will drive the prediction;
  • Specify probability distributions of the independent variables, using historical data and/or the analyst’s subjective judgment to define a range of likely values and assign probability weights for each; and
  • Run simulations repeatedly, generating random values of the independent variables, until enough results are gathered to make up a representative sample of the near-infinite number of possible combinations.
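
To make the three steps concrete, here is a minimal sketch in plain Python, using the profit example from earlier in this post. Everything specific in it is an assumption made for illustration: the model (profit = units sold × (price − cost)), the fixed figure of 1,000 units, the triangular distribution for material cost, the normal distribution for market price, and the function name simulate_profit. It is not how SPSS Statistics or any other tool implements simulation.

```python
import random
import statistics


def simulate_profit(n_runs=10_000, seed=42):
    """Return one simulated profit per run (hypothetical model)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    units_sold = 1_000         # assumed to be known with certainty
    profits = []
    for _ in range(n_runs):
        # Step 2: draw each uncertain input from its assumed distribution.
        material_cost = rng.triangular(4.0, 7.0, 5.0)  # low, high, most likely
        market_price = rng.gauss(10.0, 1.0)            # mean, standard deviation
        # Step 1: the predictive model that turns inputs into the target.
        profits.append(units_sold * (market_price - material_cost))
    return profits


# Step 3: run the simulation many times and summarize the outcomes.
profits = sorted(simulate_profit())
mean_profit = statistics.fmean(profits)
p5 = profits[int(0.05 * len(profits))]    # rough 5th percentile
p95 = profits[int(0.95 * len(profits))]   # rough 95th percentile
prob_loss = sum(p < 0 for p in profits) / len(profits)

print(f"mean profit: {mean_profit:,.0f}")
print(f"5th to 95th percentile: {p5:,.0f} to {p95:,.0f}")
print(f"estimated probability of a loss: {prob_loss:.4f}")
```

The result is a distribution of profits rather than a single number, so you can read off a likely range and the estimated chance of a loss.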

You can run as many Monte Carlo simulations as you wish by modifying the underlying parameters that you use to simulate the data.  

Using the simulation module in SPSS Statistics, you can, for example, simulate various advertising budget amounts and see how that affects total sales. Based on the outcome of the simulation, you might decide to spend more on advertising to meet your total sales goal. With automation, features for saving simulation plans and support for predictive modeling, the simulation module in SPSS Statistics smoothly combines risk analysis and Monte Carlo simulations in one software solution. 
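
The advertising scenario can be sketched the same way in plain Python. This is only an illustration of the idea, not the SPSS Statistics simulation module: the sales model (baseline sales plus an uncertain response per advertising dollar), both distributions, the sales goal of 150,000, and the function name simulate_sales are all hypothetical.

```python
import random
import statistics


def simulate_sales(ad_budget, n_runs=5_000, seed=0):
    """Return simulated total sales for one advertising budget (hypothetical model)."""
    rng = random.Random(seed)  # same seed for every budget, so budgets are compared on the same draws
    results = []
    for _ in range(n_runs):
        baseline = rng.gauss(100_000, 10_000)  # assumed sales with no advertising
        response = rng.uniform(1.5, 3.0)       # assumed sales gained per ad dollar
        results.append(baseline + response * ad_budget)
    return results


sales_goal = 150_000  # assumed target
summary = {}
for budget in (10_000, 20_000, 30_000):
    sales = simulate_sales(budget)
    summary[budget] = {
        "mean": statistics.fmean(sales),
        "p_goal": sum(s >= sales_goal for s in sales) / len(sales),
    }
    print(f"budget {budget:>6,}: mean sales {summary[budget]['mean']:,.0f}, "
          f"P(meet goal) = {summary[budget]['p_goal']:.2f}")
```

Comparing the estimated probability of meeting the goal across budgets is what would inform a decision to spend more, under these assumed distributions.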

Interested in learning more about Monte Carlo simulation in SPSS Statistics? Check out this informative page and start your free trial today.