An Empirical Evaluation of Intrinsic Dimension Estimators | SpringerLink
Skip to main content

An Empirical Evaluation of Intrinsic Dimension Estimators

  • Conference paper
  • First Online:
Similarity Search and Applications (SISAP 2015)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9371))

Included in the following conference series:

  • 1108 Accesses

Abstract

We study the practical behavior of different algorithms that aim to estimate the intrinsic dimension (ID) in metric spaces. Some of these algorithms were specifically developed to evaluate the complexity of searching in metric spaces, based on different theories related to the distribution of distances between objects on such spaces. Others were originally designed for vector spaces only, and have been extended to general metric spaces. To empirically evaluate the fitness of various ID estimations with the actual difficulty of searching in metric spaces, we compare one representative of each of the broadest families of metric indices: those based on pivots and those based on compact partitions. Our preliminary conclusions are that Fastmap and the measure called Intrinsic Dimensionality fit best their purpose.

Partially funded by basal funds FB0001, Conicyt, Chile and Fondecyt grant 1131044, Chile.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Brin, S.: Near neighbor search in large metric spaces. In: Proc. 21st Conf. on Very Large Databases (VLDB 1995), pp. 574–584 (1995)

    Google Scholar 

  2. Camastra, F.: Data dimensionality estimation methods: a survey. Pattern Recognition 36(12), 2945–2954 (2003)

    Article  MATH  Google Scholar 

  3. Camastra, F., Vinciarelli, A.: Estimating the intrinsic dimension of data with a fractal-based method. IEEE TPAMI 24(10), 1404–1407 (2002)

    Article  Google Scholar 

  4. Chávez, E., Marroquín, J.: Proximity queries in metric spaces. In: Proc. 4th South American Workshop on String Processing (WSP 1997), pp. 21–36. Carleton University Press (1997)

    Google Scholar 

  5. Chávez, E., Navarro, G.: A compact space decomposition for effective metric indexing. Pattern Recognition Letters 26(9), 1363–1376 (2005)

    Article  Google Scholar 

  6. Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.: Searching in metric spaces. ACM Computing Surveys 33(3), 273–321 (2001)

    Article  Google Scholar 

  7. Ciaccia, P., Patella, M., Zezula, P.: M-tree: an efficient access method for similarity search in metric spaces. In: Proc. 23rd VLDB, pp. 426–435 (1997)

    Google Scholar 

  8. Ciaccia, P., Patella, M., Zezula, P.: A cost model for similarity queries in metric spaces. In: PODS, pp. 59–68 (1998)

    Google Scholar 

  9. Eckmann, J.P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57, 617 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  10. Faloutsos, C., Lin, K.-I.: Fastmap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. In: Proc. 1995 ACM SIGMOD Intl. Conf. on Management of Data, pp. 163–174. ACM Press (1995)

    Google Scholar 

  11. Figueroa, K., Navarro, G., Chávez, E.: Metric spaces library (2007). http://www.sisap.org/Metric_Space_Library.html

  12. Fukunaga, K.: Introduction to Statistical Pattern Recognition, 2nd edn. Academic Press Professional Inc, San Diego (1990)

    MATH  Google Scholar 

  13. Jagadish, H.V.: A retrieval technique for similar shapes. In: SIGMOD Conference, pp. 208–217. ACM Press (1991)

    Google Scholar 

  14. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall Inc, Upper Saddle River (1988)

    MATH  Google Scholar 

  15. Jolliffe, I.T.: Principal Component Analysis, 2nd edn. Springer Series in Statistics. Springer (2002)

    Google Scholar 

  16. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady 10(8), 707–710 (1966)

    MathSciNet  MATH  Google Scholar 

  17. Mandelbrot, B.: Fractals: Form, Chance and Dimension. W. H. Freeman, San Francisco (1977)

    MATH  Google Scholar 

  18. Ott, E.: Chaos in Dynamical Systems. Cambridge University Press, Cambridge (1993)

    MATH  Google Scholar 

  19. Pestov, V.: Intrinsic dimension of a dataset: what properties does one expect? In: Intl. Joint Conf. on Neural Networks (IJCNN), pp. 2959–2964 (2007)

    Google Scholar 

  20. Pestov, V.: An axiomatic approach to intrinsic dimension of a dataset. Neural Networks 21(23), 204–213 (2008). Advances in Neural Networks Research: 2007 International Joint Conference on Neural Networks (IJCNN)

    Google Scholar 

  21. Samet, H.: Foundations of Multidimensional and Metric Data Structures (The Morgan Kaufmann Series in Computer Graphics and Geometric Modeling). Morgan Kaufmann Publishers Inc., San Francisco (2005)

    Google Scholar 

  22. Traina Jr., C., Traina, A.J.M., Faloutsos, C.: Distance exponent: a new concept for selectivity estimation in metric trees. Research Paper 99–110, School of Computer Science, Carnegie Mellon University, 03/1999 (1999)

    Google Scholar 

  23. Yianilos, P.: Excluded middle vantage point forests for nearest neighbor search. In: DIMACS Implementation Challenge, ALENEX 1999, Baltimore, MD (1999)

    Google Scholar 

  24. Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach. Advances in Database Systems, vol. 32. Springer (2006)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rodrigo Paredes .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bustos, C., Navarro, G., Reyes, N., Paredes, R. (2015). An Empirical Evaluation of Intrinsic Dimension Estimators. In: Amato, G., Connor, R., Falchi, F., Gennaro, C. (eds) Similarity Search and Applications. SISAP 2015. Lecture Notes in Computer Science(), vol 9371. Springer, Cham. https://doi.org/10.1007/978-3-319-25087-8_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-25087-8_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25086-1

  • Online ISBN: 978-3-319-25087-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics