Abstract:
Using several measures that characterise the complexity of classification problems, we studied the comparative advantages of two methods for constructing decision forests: bootstrapping and random subspaces. We investigated a collection of 392 two-class problems from the UCI repository and observed strong correlations between classifier accuracy and measures of the length of the class boundary, the thickness of the class manifolds, and the nonlinearity of the decision boundary. We identified characteristics of both difficult and easy cases in which the combination methods are no better than single classifiers. We also observed that the bootstrapping method is better when the training samples are sparse, and the subspace method is better when the classes are compact and the boundaries are smooth.
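The two forest constructors compared in the paper can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes scikit-learn decision trees as base classifiers, two-class labels coded as 0/1, an unweighted majority vote, and illustrative defaults for the number of trees and the subspace size.

```python
# Minimal sketch (not the paper's implementation) of the two forest
# constructors: bootstrapping of training samples vs. random subspaces
# of features. Base learners, parameters, and the voting rule are
# illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def bootstrap_forest(X, y, n_trees=100, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)              # resample the training rows
        forest.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return forest


def subspace_forest(X, y, n_trees=100, frac=0.5, seed=0):
    """Train each tree on all rows but a random subset of the features."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    k = max(1, int(frac * d))
    forest = []
    for _ in range(n_trees):
        feats = rng.choice(d, size=k, replace=False)  # pick a random feature subspace
        forest.append((feats, DecisionTreeClassifier().fit(X[:, feats], y)))
    return forest


def vote(forest, X):
    """Combine tree outputs by unweighted majority vote (labels 0/1)."""
    preds = []
    for member in forest:
        if isinstance(member, tuple):                 # subspace forest member
            feats, tree = member
            preds.append(tree.predict(X[:, feats]))
        else:                                         # bootstrap forest member
            preds.append(member.predict(X))
    return (np.mean(preds, axis=0) > 0.5).astype(int)
```

The key contrast is where the randomisation is injected: bootstrapping perturbs the set of training points each tree sees, whereas the random subspace method perturbs the set of features, which relates directly to the paper's findings on sparse samples versus compact classes with smooth boundaries.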
Received: 03 November 2000, Received in revised form: 25 October 2001, Accepted: 04 January 2002
Cite this article
Ho, T. A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors. Pattern Anal Appl 5, 102–112 (2002). https://doi.org/10.1007/s100440200009