Methods Inf Med 2010; 49(03): 219-229
DOI: 10.3414/ME0543
Original Articles
Schattauer GmbH

An Experimental Evaluation of Boosting Methods for Classification

R. Stollhoff 1, W. Sauerbrei 2, M. Schumacher 2

1   Max-Planck Institute for Mathematics in the Sciences, Leipzig, Germany
2   Institute of Medical Biometry and Medical Informatics, University Medical Center, Freiburg, Germany

Publication History

Received: 11 February 2008
Accepted: 07 February 2009
Published online: 17 January 2018

Summary

Objectives: In clinical medicine, the accuracy achieved by classification rules is often not sufficient to justify their use in daily practice. To improve classifiers, it has become popular to combine single classification rules into a classification ensemble. Two popular boosting methods will be compared with classical statistical approaches.

Methods: Using data from a clinical study on the diagnosis of breast tumors, and by simulation, we will compare AdaBoost with gradient boosting ensembles of regression trees. We will also consider a classification tree approach and logistic regression as traditional competitors. In logistic regression we allow for the selection of nonlinear effects by the fractional polynomial approach. Performance of the classifiers will be assessed by estimated misclassification rates and the Brier score.
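As an illustration of how such a comparison can be set up in R (the software environment used in the study), the following minimal sketch fits a gradient boosting ensemble of regression trees (gbm package), a single classification tree (rpart), and a logistic regression model, and estimates the misclassification rate and the Brier score on held-out data. The data frame d with binary outcome y and predictors x1, x2, as well as the train/test split and all tuning values, are illustrative assumptions, not the study data or settings.

```r
## Minimal, hypothetical sketch; `d` is an assumed data frame with a
## binary outcome y (coded 0/1) and continuous predictors x1, x2.
library(gbm)    # gradient boosting of regression trees
library(rpart)  # classification trees (CART)

set.seed(1)
train <- sample(nrow(d), round(0.7 * nrow(d)))
test  <- d[-train, ]

## Gradient boosting with small regression trees as base learners
## (distribution = "adaboost" would give the exponential-loss,
## AdaBoost-type analogue)
fit_gbm <- gbm(y ~ x1 + x2, data = d[train, ], distribution = "bernoulli",
               n.trees = 500, interaction.depth = 2, shrinkage = 0.05)

## Classification tree and logistic regression as traditional competitors
## (fractional polynomial terms could be selected via the mfp package
## instead of the linear terms used here)
fit_tree <- rpart(factor(y) ~ x1 + x2, data = d[train, ], method = "class")
fit_glm  <- glm(y ~ x1 + x2, data = d[train, ], family = binomial)

## Predicted probabilities of class 1 on the test set
p <- list(
  gbm  = predict(fit_gbm, test, n.trees = 500, type = "response"),
  tree = predict(fit_tree, test, type = "prob")[, "1"],
  glm  = predict(fit_glm, test, type = "response")
)

## Estimated misclassification rate and Brier score per classifier
sapply(p, function(pr) c(
  misclass = mean((pr > 0.5) != test$y),
  brier    = mean((pr - test$y)^2)
))
```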

Results: We will show that boosting simple base classifiers gives classification rules with improved predictive ability. However, the performance of the boosting classifiers was not generally superior to that of logistic regression. In contrast to the computer-intensive methods, logistic regression yields classifiers that are much easier to interpret and to use.

Conclusions: In medical applications, the logistic regression model remains a method of choice or, at least, a serious competitor to more sophisticated techniques. Refining boosting methods by optimizing the number of boosting steps may lead to further improvement.
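One concrete way to realize the optimized number of boosting steps suggested above is to cross-validate over boosting iterations, which the gbm package supports directly. The sketch below is a minimal illustration under the same assumptions as the Methods sketch (hypothetical data frame d with outcome y and predictors x1, x2).

```r
## Illustrative tuning of the number of boosting steps by 5-fold
## cross-validation (hypothetical data frame `d` as above)
library(gbm)

fit <- gbm(y ~ x1 + x2, data = d, distribution = "bernoulli",
           n.trees = 2000, interaction.depth = 2, shrinkage = 0.05,
           cv.folds = 5)

## gbm.perf returns the iteration minimizing the cross-validated deviance;
## using it instead of a fixed n.trees guards against overfitting
best_iter <- gbm.perf(fit, method = "cv")
p_opt <- predict(fit, d, n.trees = best_iter, type = "response")
```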

 