Search Data Structures for Skewed Strings | SpringerLink
Skip to main content

Search Data Structures for Skewed Strings

  • Conference paper
  • First Online:
Experimental and Efficient Algorithms (WEA 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2647))

Included in the following conference series:

  • 608 Accesses

Abstract

We consider skewed distributions of strings, in which any two such strings share a common prefix much longer than that expected in uniformly distributed (random) strings. For instance, this is the case of URL addresses, IP addresses, or XML path strings, all representing paths in some hierarchical order. As strings sharing a portion of the path have a quite long common prefix, we need to avoid the time-consuming repeated examination of these common prefixes while handling the linked data structures storing them. For this purpose, we show how to implement search data structures that can operate on strings with long prefixes in common. Despite the simplicity and the generality of the method, our experimental study shows that it is quite competitive with several optimized and tuned implementations currently available in the literature.

Partially supported by the Italian MIUR Project ALINWEB.

Partially supported by the EU Programme ALCOM-FT (n. IST-1999-14186) and by the Italian MIUR Project ALINWEB.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 5719
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 7149
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. A. Acharya, H. Zhu, K. Shen. Adaptive algorithms for cache-efficient trie search. Proc. ALENEX 99. Source code in http://www.cs.rochester.edu/~kshen, 1999.

  2. G. M. Adel’son-Vel’skii and E. M. Landis. An algorithm for the organization of information. Soviet Mathematics Doklady, 3 (1962), 1259–1263.

    Google Scholar 

  3. Arianna Search Engine, http://arianna.libero.it.

  4. R. Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica 1 (1972), 290–306.

    Article  MATH  MathSciNet  Google Scholar 

  5. J. L. Bentley and R. Sedgewick. Fast algorithms for sorting and searching strings. SODA 1997, 360–369 (http://www.cs.princeton.edu/~rs/strings).

  6. S. W. Bent, D. D. Sleator and R. E. Tarjan. Biased search trees. SIAM Journal on Computing 14 (1985), 545–568.

    Article  MATH  MathSciNet  Google Scholar 

  7. F. Bonchi, F. Giannotti, C. Gozzi, G. Manco, M. Nanni, D. Pedreschi, C. Renso and S. Ruggieri, Web log data warehousing and mining for intelligent web caching. Data and Knowledge Engineering, to appear.

    Google Scholar 

  8. H. A. Clampett. Randomized binary searching with the tree structures. Communications of the ACM 7 (1964), 163–165.

    Article  MATH  Google Scholar 

  9. J. Clément, Ph. Flajolet and B. Vallée. The analysis of hybrid trie structures. SODA 1999.

    Google Scholar 

  10. J. Feigenbaum and R. E. Tarjan. Two new kinds of biased search trees. Bell Systems Technical Journal 62 (1983), 3139–3158.

    MATH  MathSciNet  Google Scholar 

  11. T. F. Gonzalez. The on-line d-dimensional dictionary problem. SODA 1992, 376–385.

    Google Scholar 

  12. R. Grossi, G. F. Italiano. Efficient techniques for maintaining multidimensional keys in linked data structures. ICALP 1999, 372–381.

    Google Scholar 

  13. R. H. Gueting and H.-P. Kriegel. Multidimensional B-tree: An efficient dynamic file structure for exact match queries. 10th GI Annual Conference, 375–388.

    Google Scholar 

  14. L. J. Guibas and R. Sedgewick. A dichromatic framework for balanced trees. FOCS 1978, 8–21.

    Google Scholar 

  15. D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge University Press, 1997.

    Google Scholar 

  16. S. Huddleston and K. Mehlhorn. A new data structure for representing sorted lists. Acta Informatica 17 (1982), 157–184.

    Article  MATH  MathSciNet  Google Scholar 

  17. D. E. Knuth. The Art of computer programming, Vol. 3: Sorting and Searching. Second edition, Addison-Wesley, 1973, 1998.

    Google Scholar 

  18. R. Jenkins, http://burtleburtle.net/bob/hash/.

  19. K. Mehlhorn. Dynamic binary search. SIAM J. on Computing 8 (1979), 175–198.

    Article  MATH  MathSciNet  Google Scholar 

  20. K. Mehlhorn. Data structures and algorithms:Searching and sorting, Springer 1984.

    Google Scholar 

  21. D. R. Morrison. PATRICIA — Practical Algorithm To Retrieve Information Coded In Alphanumeric. J. ACM, 15, 514–534, 1968.

    Article  Google Scholar 

  22. J. Nievergelt and E. M. Reingold. Binary search trees of bounded balance. SIAM Journal on Computing 2 (1973), 33–43.

    Article  MATH  MathSciNet  Google Scholar 

  23. J. M. Ockerbloom. The on-line books page. http://digital.library.upenn.edu/books/

  24. W. Pugh. Skip Lists: A probabilistic alternative to balanced trees. Communications of the ACM 33 (1990), 668–676.

    Article  Google Scholar 

  25. W. Pugh. Skip Lists ftp directory. ftp://ftp.cs.umd.edu/pub/skipLists/.

  26. R. Sedgewick. Algorithms in C, Addison-Wesley, 1998.

    Google Scholar 

  27. R. Seidel and C. R. Aragon. Randomized search trees. Algorithmica 16 (1996), 464–497.

    Article  MATH  MathSciNet  Google Scholar 

  28. C. Silverstein. Library call data set. http://theory.stanford.edu/~csilvers/libdata.

  29. C. Silverstein. A practical perfect hash algorithm. Manuscript, 1998.

    Google Scholar 

  30. D. D. Sleator and R. E. Tarjan. Self-adjusting binary search trees. Journal of the ACM 32 (1985), 652–686.

    Article  MATH  MathSciNet  Google Scholar 

  31. W. Szpankowski, Average Case Analysis of Algorithms on Sequences, Wiley, 2001.

    Google Scholar 

  32. R. E. Tarjan. Data structures and network algorithms, SIAM (1983).

    Google Scholar 

  33. V. K. Vaishnavi. Multidimensional height-balanced trees. IEEE Transactions on Computers C-33 (1984), 334–343.

    Article  Google Scholar 

  34. V. K. Vaishnavi. Multidimensional balanced binary trees. IEEE Transactions on Computers C-38 (1989), 968–985.

    Article  MathSciNet  Google Scholar 

  35. V. K. Vaishnavi. On k-dimensional balanced binary trees. Journal of Computer and System Sciences 52 (1996), 328–348.

    Article  MathSciNet  Google Scholar 

  36. Mark A. Weiss. Data structures and algorithm analysis in C, Addison Wesley, (1997). Source code in http://www.cs.fiu.edu/~weiss/dsaa_c2e/files.html.

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crescenzi, P., Grossi, R., Italiano, G.F. (2003). Search Data Structures for Skewed Strings. In: Jansen, K., Margraf, M., Mastrolilli, M., Rolim, J.D.P. (eds) Experimental and Efficient Algorithms. WEA 2003. Lecture Notes in Computer Science, vol 2647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44867-5_7

Download citation

  • DOI: https://doi.org/10.1007/3-540-44867-5_7

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40205-3

  • Online ISBN: 978-3-540-44867-9

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics