Boosting to Forecast

March 1, 2013 Kelley MacEwen

Summit just held an Econometrics Seminar Program session on boosted regression (boosting), a relatively new data mining technique that has shown considerable success in predictive accuracy. Boosting is a very flexible regression method that, in contrast to many traditional regression-based forecasting methods, lets us specify the predictor variables without specifying their functional relationships to the response variable we want to forecast. Boosting has been picking up a lot of steam, with some researchers going so far as to say that “[t]here is mounting empirical evidence that boosting is one of the best modeling approaches ever developed” (Schonlau 2005). While we wouldn’t necessarily go that far, we agree that boosting is a promising, powerful technique worth weighing against other cutting-edge methods for some of our clients’ tougher questions.

The first boosting algorithm, AdaBoost, was developed by computer scientists Freund and Schapire in 1997. Friedman et al. (2000) reinterpreted AdaBoost in a likelihood framework, allowing boosting algorithms to be developed for all common error distributions. Two commonly used variations on Friedman’s boosting algorithm are shrinkage, which scales down each boosting step and helps to avoid overfitting, and bagging, which is thought to reduce the variance of the final prediction without affecting its bias. Computer scientists tend to think of boosting as an ensemble method, that is, a method that averages over many classifiers, with repeatedly misclassified observations receiving successively larger weights; statisticians, by contrast, tend to think of boosting as a sequential regression method.
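To make the statisticians’ view concrete, here is a minimal sketch of sequential boosting under squared-error loss. Everything in it, including the function names and the choice of scikit-learn’s DecisionTreeRegressor as the base learner, is our own illustration rather than anything prescribed by the seminar or the papers below:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def boost(X, y, n_rounds=200, shrinkage=0.1, max_depth=2):
        """Sequential boosting for squared-error loss: each round fits a
        small tree to the current residuals and adds a shrunken copy of
        its predictions to the running forecast."""
        base = float(np.mean(y))          # start from the unconditional mean
        pred = np.full(len(y), base)
        trees = []
        for _ in range(n_rounds):
            residuals = y - pred          # the part of y not yet explained
            tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
            pred = pred + shrinkage * tree.predict(X)  # shrinkage damps each step
            trees.append(tree)
        return base, trees

    def forecast(base, trees, X, shrinkage=0.1):
        """Sum the base forecast and every tree's shrunken contribution."""
        return base + shrinkage * sum(tree.predict(X) for tree in trees)

Each tree sees only the residuals its predecessors left behind, which is why hard-to-fit observations come to dominate later rounds, and the shrinkage factor scales down every step, trading more rounds for less overfitting.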

Boosting can be a favorable forecasting approach when you have a large dataset, more variables than observations, suspected nonlinearities and/or interactions, ordered categorical predictor variables, or correlated data; that is to say, in a lot of data analytics settings. If you have a small dataset, a set of predictor variables that consists only of indicator variables, or simply a good reason to explicitly specify your model’s structure and the functional relationships of the variables within it, then boosting is probably not the way to go. It is also worth noting that boosting’s edge in predictive accuracy is still a matter of debate and that the method seems to be especially susceptible to noisy data.
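As a quick illustration of the favorable case, the sketch below (again our own, not from the seminar) simulates data with a nonlinear interaction that we never describe to either model and compares scikit-learn’s GradientBoostingRegressor against a linear regression; the subsample option is the bagging-style variation mentioned above:

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.RandomState(0)
    X = rng.uniform(-3, 3, size=(2000, 2))
    # Response with a nonlinear interaction we never tell the models about
    y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.3, size=2000)

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    boosted = GradientBoostingRegressor(
        n_estimators=500,    # boosting rounds
        learning_rate=0.05,  # shrinkage
        max_depth=2,         # small trees; depth 2 admits pairwise interactions
        subsample=0.5,       # fit each tree on a random half: bagging-style
    ).fit(X_train, y_train)
    linear = LinearRegression().fit(X_train, y_train)

    for name, model in [("boosted", boosted), ("linear", linear)]:
        mse = np.mean((model.predict(X_test) - y_test) ** 2)
        print(f"{name}: test MSE = {mse:.3f}")

On data like these, the boosted model’s test error should come in well below the linear model’s, precisely because boosting discovers the sin(x1) * x2 structure on its own.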

We’ve just scratched the surface here, but if you’d like more information, we invite you to check out the papers on which we based our presentation:

  • Freund, Y. and Schapire, R. (1999). A Short Introduction to Boosting. Journal of Japanese Society for Artificial Intelligence, 14(5):771–780.
  • Schonlau, M. (2005). Boosted Regression (Boosting): An Introductory Tutorial and a Stata Plugin. The Stata Journal, 5(3):330–354.
