Abstract
We studied several measures of the complexity of classification problems and related them to the comparative advantages of two methods for creating multiple classifier systems. Using decision trees as the prototypical classifier, and bootstrapping and subspace projection as the classifier generation methods, we studied a collection of 437 two-class problems from public databases. We observed strong correlations between classifier accuracies, a measure of class boundary length, and a measure of class manifold thickness. Bootstrapping appears to be better when subsamples yield more variable boundary measures, while the subspace method excels when many features contribute evenly to the discrimination.
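The two generation methods compared in the paper can be sketched as follows. This is a minimal illustration, not the paper's implementation: it substitutes one-feature decision stumps for full decision trees, and all function names (`train_stump`, `bagging_ensemble`, `subspace_ensemble`, `vote`) are hypothetical. The essential contrast is preserved: bagging varies the training *points* each learner sees, while the random subspace method varies the *features*.

```python
import random

def train_stump(X, y, features):
    """Exhaustively pick the best single-feature threshold split by
    training accuracy; a stand-in for the paper's decision trees."""
    best, best_acc = None, -1.0
    for f in features:
        for t in sorted(set(row[f] for row in X)):
            for sign in (1, -1):
                preds = [1 if sign * (row[f] - t) > 0 else 0 for row in X]
                acc = sum(p == label for p, label in zip(preds, y)) / len(y)
                if acc > best_acc:
                    best_acc, best = acc, (f, t, sign)
    return best

def stump_predict(stump, row):
    f, t, sign = stump
    return 1 if sign * (row[f] - t) > 0 else 0

def bagging_ensemble(X, y, n_estimators, rng):
    """Bagging: each learner is trained on a bootstrap resample of the
    points but sees every feature."""
    n, d = len(X), len(X[0])
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(train_stump([X[i] for i in idx],
                                  [y[i] for i in idx], range(d)))
    return stumps

def subspace_ensemble(X, y, n_estimators, k, rng):
    """Random subspace method: each learner is trained on all points
    but only a random k-feature subset."""
    d = len(X[0])
    return [train_stump(X, y, rng.sample(range(d), k))
            for _ in range(n_estimators)]

def vote(stumps, row):
    """Combine the ensemble by unweighted majority vote."""
    return int(sum(stump_predict(s, row) for s in stumps) > len(stumps) / 2)
```

Under the paper's findings, one would expect `bagging_ensemble` to help most when bootstrap resamples produce variable class boundaries, and `subspace_ensemble` to help most when many features carry comparable discriminative power, so that different subsets still yield useful learners.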
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Ho, T.K. (2000). Complexity of Classification Problems and Comparative Advantages of Combined Classifiers. In: Multiple Classifier Systems. MCS 2000. Lecture Notes in Computer Science, vol 1857. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45014-9_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67704-8
Online ISBN: 978-3-540-45014-6