Abstract
Random Forests were introduced by Leo Breiman [6] who was inspired by earlier work by Amit and Geman [2]. Although not obvious from the description in [6], Random Forests are an extension of Breiman’s bagging idea [5] and were developed as a competitor to boosting. Random Forests can be used for either a categorical response variable, referred to in [6] as “classification,” or a continuous response, referred to as “regression.” Similarly, the predictor variables can be either categorical or continuous.
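The idea sketched in the abstract — bootstrap resampling as in bagging [5], plus randomized feature selection as in [2], with majority vote across trees — can be illustrated with a deliberately minimal toy in pure Python. This is not Breiman's algorithm from [6]: real Random Forests grow full trees and draw a fresh random feature subset at every node, whereas this sketch uses one-node "stump" trees (so the per-tree subset coincides with the per-node subset) and a tiny hand-made classification dataset. All function names and the data are illustrative assumptions.

```python
import random

def bootstrap_sample(X, y, rng):
    # The "bagging" step: draw n points with replacement from the training set.
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]

def fit_stump(X, y, feats):
    # Fit a one-split tree, searching thresholds only over the randomly
    # chosen feature subset `feats` (the Random Forest ingredient).
    best = None
    for f in feats:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] < t]
            right = [lab for row, lab in zip(X, y) if row[f] >= t]
            if not left or not right:
                continue
            ll = max(set(left), key=left.count)    # majority label, left side
            rl = max(set(right), key=right.count)  # majority label, right side
            err = sum(lab != ll for lab in left) + sum(lab != rl for lab in right)
            if best is None or err < best[0]:
                best = (err, f, t, ll, rl)
    if best is None:  # no valid split: predict the overall majority class
        maj = max(set(y), key=y.count)
        return (None, None, maj, maj)
    return best[1:]

def predict_stump(stump, x):
    f, t, ll, rl = stump
    if f is None:
        return ll
    return ll if x[f] < t else rl

def random_forest_fit(X, y, n_trees=15, mtry=1, seed=0):
    # Grow each tree on its own bootstrap sample, restricted to a random
    # subset of mtry features; the ensemble predicts by majority vote.
    rng = random.Random(seed)
    p = len(X[0])
    forest = []
    for _ in range(n_trees):
        Xb, yb = bootstrap_sample(X, y, rng)
        feats = rng.sample(range(p), mtry)
        forest.append(fit_stump(Xb, yb, feats))
    return forest

def forest_predict(forest, x):
    votes = [predict_stump(s, x) for s in forest]
    return max(set(votes), key=votes.count)

# Toy two-class data: both features separate the classes near 0.5.
X = [[0.1, 0.2], [0.2, 0.3], [0.3, 0.1], [0.7, 0.8], [0.8, 0.9], [0.9, 0.7]]
y = [0, 0, 0, 1, 1, 1]
forest = random_forest_fit(X, y, n_trees=15, mtry=1, seed=1)
print(forest_predict(forest, [0.2, 0.2]), forest_predict(forest, [0.8, 0.8]))
```

Because each tree sees a different bootstrap sample and a different feature subset, the trees are decorrelated, and averaging their votes reduces variance relative to a single tree — the mechanism that distinguishes Random Forests from plain bagging. For practical use, the randomForest R package [22] implements the full algorithm.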
References
Amaratunga, D., Cabrera, J., Lee, Y.-S.: Enriched random forests. Bioinformatics 24 (18) pp. 2010–2014 (2008).
Amit, Y., Geman, D.: Shape quantization and recognition with randomized trees. Neural Computation 9 (7) pp. 1545–1588 (1997).
Biau, G., Devroye, L., Lugosi, G.: Consistency of Random Forests and Other Averaging Classifiers. Journal of Machine Learning Research 9 pp. 2039–2057 (2008).
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth, New York (1984).
Breiman, L.: Bagging Predictors. Machine Learning 24 (2) pp. 123–140 (1996).
Breiman, L.: Random Forests. Machine Learning 45 (1) pp. 5–32 (2001).
Chen, X., Liu, C.-T., Zhang, M., Zhang, H.: A forest-based approach to identifying gene and gene–gene interactions. Proc Natl Acad Sci USA 104 (49) pp. 19199–19203 (2007).
Dettling, M.: BagBoosting for Tumor Classification with Gene Expression Data. Bioinformatics 20 (18) pp. 3583–3593 (2004).
Diaz-Uriarte, R., Alvarez de Andres, S.: Gene Selection and Classification of Microarray Data Using Random Forest. BMC Bioinformatics 7 (1) 3 (2006).
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Series in Statistics, Springer, New York (2009).
Goldstein, B., Hubbard, A., Cutler, A., Barcellos, L.: An application of Random Forests to a genome-wide association dataset: Methodological considerations & new findings. BMC Genetics 11 (1) 49 (2010).
Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A., Van Der Laan, M.: Survival Ensembles. Biostatistics 7 (3) pp. 355–373 (2006).
Ishwaran, H., Kogalur, U.B., Blackstone, E.H., Lauer, M.S.: Random survival forests. Annals of Applied Statistics 2 (3) pp. 841–860 (2008).
Izenman, A.: Modern Multivariate Statistical Techniques. Springer Texts in Statistics, Springer, New York (2008).
Liaw, A., Wiener, M.: Classification and Regression by randomForest. R News 2 (3) pp. 18–22 (2002).
Lin, Y., Jeon, Y.: Random Forests and Adaptive Nearest Neighbors. Journal of the American Statistical Association 101 (474) pp. 578–590 (2006).
Mease, D., Wyner, A.: Evidence Contrary to the Statistical View of Boosting. Journal of Machine Learning Research 9 pp. 131–156 (2008).
Meinshausen, N.: Quantile Regression Forests. Journal of Machine Learning Research 7 pp. 983–999 (2006).
R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2011). http://www.R-project.org.
Schroff, F., Criminisi, A., Zisserman, A.: Object Class Segmentation using Random Forests. Proceedings of the British Machine Vision Conference 2008, British Machine Vision Association, 1 (2008).
Segal, M., Xiao, Y.: Multivariate Random Forests. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1 (1) pp. 80–87 (2011).
Singh D., Febbo P.G., Ross K., Jackson D.G., Manola J., Ladd C., Tamayo P., Renshaw A.A., D’Amico A.V., Richie J.P., Lander E.S., Loda M., Kantoff P.W., Golub T.R., Sellers W.R.: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1 (2) pp. 203–209 (2002).
Stamey, T., Kabalin, J., McNeal, J., Johnstone, I., Freiha, F., Redwine, E., Yang, N.: Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate. II. Radical prostatectomy treated patients. Journal of Urology 141 pp. 1076–1083 (1989).
Statnikov, A., Wang, L., Aliferis, C.: A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinformatics 9 (1) 319 (2008).
Wang, M., Chen, X., Zhang, H.: Maximal conditional chi-square importance in random forests. Bioinformatics 26 (6) pp. 831–837 (2010).
Zhang, H., Singer, B.H.: Recursive Partitioning and Applications, Second Edition. Springer Series in Statistics, Springer, New York (2010).
Copyright information
© 2012 Springer Science+Business Media, LLC
Cite this chapter
Cutler, A., Cutler, D.R., Stevens, J.R. (2012). Random Forests. In: Zhang, C., Ma, Y. (eds) Ensemble Machine Learning. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9326-7_5
Print ISBN: 978-1-4419-9325-0
Online ISBN: 978-1-4419-9326-7