Abstract
The ordinal forest method is a random forest-based prediction method for ordinal response variables. Ordinal forests allow prediction using both low-dimensional and high-dimensional covariate data and can additionally be used to rank covariates with respect to their importance for prediction. An extensive comparison study reveals that ordinal forests tend to outperform competitors in terms of prediction performance. Moreover, the covariate importance measure currently used by ordinal forests discriminates influential covariates from noise covariates about as well as, or better than, the measures used by competitors. Several further important properties of the ordinal forest algorithm are studied in additional investigations. The rationale underlying ordinal forests, namely using optimized score values in place of the class values of the ordinal response variable, is in principle applicable to any regression method for continuous outcomes, not only to the random forests considered in the ordinal forest method.
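The idea of replacing ordinal class values by numeric scores can be illustrated with a minimal sketch. It is not the ordinal forest algorithm itself: the score values here come from a fixed, hypothetical partition of [0, 1] rather than from an optimization, and a simple nearest-neighbour average stands in for the regression forest. All names (`class_scores`, `score_to_class`, `knn_regress`, `predict_class`) and the toy data are assumptions made for illustration only.

```python
from bisect import bisect_right

def class_scores(boundaries):
    """Use the midpoint of each interval as the score of the corresponding class.

    boundaries is an increasing list d_0 < d_1 < ... < d_J for J ordered classes.
    """
    return [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

def score_to_class(value, boundaries):
    """Map a continuous prediction back to the 1-based class whose interval
    contains it; values outside the outer boundaries go to the first/last class."""
    inner = boundaries[1:-1]
    return bisect_right(inner, value) + 1

def knn_regress(train_x, train_scores, x, k=3):
    """Stand-in regressor (assumption): mean score of the k nearest
    training points for a single numeric covariate."""
    nearest = sorted(zip(train_x, train_scores), key=lambda p: abs(p[0] - x))[:k]
    return sum(score for _, score in nearest) / len(nearest)

# Hypothetical partition for three ordered classes (not optimized).
boundaries = [0.0, 0.2, 0.7, 1.0]
scores = class_scores(boundaries)

# Toy training data: one covariate, ordinal response coded 1 < 2 < 3.
train_x = [0.1, 0.3, 0.5, 1.1, 1.4, 1.6, 2.3, 2.6, 2.9]
train_y = [1, 1, 1, 2, 2, 2, 3, 3, 3]

# Replace class values by their scores, then treat the task as regression.
train_scores = [scores[y - 1] for y in train_y]

def predict_class(x, k=3):
    """Predict a continuous score, then translate it back to an ordinal class."""
    return score_to_class(knn_regress(train_x, train_scores, x, k), boundaries)

for x in (0.2, 1.4, 2.6):
    print(x, predict_class(x))
```

Because only the regression step depends on the learner, `knn_regress` could be swapped for any other method that predicts a continuous outcome, which is the point made in the last sentence of the abstract.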
Acknowledgments
The author thanks Giuseppe Casalicchio for proofreading and comments and Jenny Lee for language corrections. This work was supported by the German Science Foundation (DFG-Einzelförderung BO3139/6-1 to Anne-Laure Boulesteix).
Cite this article
Hornung, R. Ordinal Forests. J Classif 37, 4–17 (2020). https://doi.org/10.1007/s00357-018-9302-x
DOI: https://doi.org/10.1007/s00357-018-9302-x