Abstract
Regression trees are models developed for multiple regression problems. They fit constants to a set of axis-parallel partitions of the input space defined by the predictor variables, where the partitions are described by a hierarchy of logical tests on the input variables. Several authors have remarked that the criteria used to select these tests show a clear preference for what are known as end-cut splits: splits that leave very few training cases in one of the branches, which domain experts usually regard as counter-intuitive. In this paper we describe an empirical study of the effect of this end-cut preference on a large set of regression domains. The results of this study, carried out for the particular case of least squares regression trees, contradict the prior belief that this type of test should be avoided. As a consequence of these results, we present a new method for handling these tests, which we empirically show to achieve better predictive accuracy than the alternatives usually considered in tree-based models.
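To make the end-cut phenomenon concrete, the following is a minimal sketch of a CART-style least squares split criterion; the helper names sse and best_split are purely illustrative and this is not the paper's implementation. It scores every candidate cut point by the reduction in the sum of squared errors obtained when the node mean is replaced by two branch means, and a single extreme case is enough to make the best-scoring cut an end-cut split that isolates that case in a near-empty branch.

import numpy as np

def sse(y):
    # Sum of squared deviations from the mean (0 for empty or singleton sets).
    return float(np.sum((y - y.mean()) ** 2)) if len(y) > 1 else 0.0

def best_split(x, y):
    # Return (cut_point, gain) maximising the least squares error reduction
    # over all cut points between consecutive distinct values of x.
    order = np.argsort(x)
    x, y = x[order], y[order]
    total = sse(y)
    best = (None, -np.inf)
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no valid cut between equal predictor values
        gain = total - sse(y[:i]) - sse(y[i:])
        if gain > best[1]:
            best = ((x[i] + x[i - 1]) / 2.0, gain)
    return best

# Synthetic illustration (not data from the paper): one outlying case at the
# right extreme of x typically makes the highest-gain cut an end-cut split.
rng = np.random.default_rng(0)
x = np.arange(20.0)
y = rng.normal(0.0, 1.0, size=20)
y[-1] += 10.0  # outlying response at the largest x
print(best_split(x, y))  # likely a cut near x = 18.5, isolating one case

Running the sketch prints a cut near the right extreme of x, illustrating why purely least-squares-driven selection gravitates toward branches with very few training cases on noisy data.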
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Torgo, L. (2001). A Study on End-Cut Preference in Least Squares Regression Trees. In: Brazdil, P., Jorge, A. (eds) Progress in Artificial Intelligence. EPIA 2001. Lecture Notes in Computer Science, vol 2258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45329-6_14
DOI: https://doi.org/10.1007/3-540-45329-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43030-8
Online ISBN: 978-3-540-45329-1