Abstract
Many learning algorithms implicitly assume that all attributes in the data are relevant to the learning task. However, several studies have demonstrated that this assumption rarely holds; for many supervised learning algorithms, the inclusion of irrelevant or redundant attributes can degrade classification accuracy. While a variety of dimensionality reduction methods exist, many are appropriate only for datasets containing a small number of attributes (e.g. fewer than 20). This paper presents an alternative approach to dimensionality reduction and demonstrates how it can be combined with a Nearest Neighbour learning algorithm. We present an empirical evaluation of this approach and contrast its performance with two related techniques: a Monte-Carlo wrapper and an Information Gain-based filter approach.
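The abstract does not spell out the mapping itself, so the sketch below is only a minimal illustration: it assumes an SVD-style linear projection of the attribute space (the title and the Singular Value Decomposition / Latent Semantic Indexing keywords suggest a sub-space method of this kind) followed by a standard Nearest Neighbour vote in the reduced space. The function names, the plain-NumPy implementation, and the choice of SVD are ours, not the authors'.

```python
import numpy as np

def svd_subspace(X, k):
    """Fit a k-dimensional linear sub-space to the training attributes."""
    mu = X.mean(axis=0)
    # Thin SVD of the centred data; rows of Vt are orthonormal
    # directions in the original attribute space.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k].T          # (d,) mean and (d, k) projection matrix

def nn_predict(X_train, y_train, X_test, mu, V, k_neighbours=1):
    """Nearest Neighbour classification computed in the reduced sub-space."""
    Z_train = (X_train - mu) @ V
    Z_test = (X_test - mu) @ V
    preds = []
    for z in Z_test:
        dists = np.linalg.norm(Z_train - z, axis=1)
        nearest = np.argsort(dists)[:k_neighbours]
        # Majority vote among the nearest training instances.
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy usage: 200 instances with 50 attributes, mapped to a 5-D sub-space.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
mu, V = svd_subspace(X[:150], k=5)
print(nn_predict(X[:150], y[:150], X[150:], mu, V))
```

Because the projection is fitted on the training data alone and then applied to test instances, distance computations in the k-dimensional sub-space are insulated from attributes that carry little variance, which is the usual motivation for pairing such mappings with distance-based learners.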
Keywords
- Dimensionality Reduction
- Singular Value Decomposition
- Nearest Neighbour
- Irrelevant Attribute
- Latent Semantic Indexing
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Payne, T.R., Edwards, P. (2000). Dimensionality Reduction through Sub-space Mapping for Nearest Neighbour Algorithms. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. Lecture Notes in Computer Science, vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8