Text Indexing | SpringerLink
Skip to main content

Text Indexing

  • Reference work entry
  • First Online:
Encyclopedia of Algorithms

Years and Authors of Summarized Original Work

  • 1993; Manber, Myers

Problem Definition

Text or string data naturally arises in many contexts including document processing, information retrieval, natural and computer language processing, and describing molecular sequences. In broad terms, the goal of text indexing is to design methodologies to store text data so as to significantly improve the speed and performance of answering queries. While text indexing has been studied for a long time, it shot into prominence during the last decade due to the ubiquity of web-based textual data and search engines to explore it, design of digital libraries for archiving human knowledge, and application of string techniques to further understanding of modern biology. Text indexing differs from the typical indexing of keys drawn from an underlying total order – text data can have varying lengths, and queries are often more complex and involve substrings, partial matches, or approximate matches.

Queries on...

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 285999
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
JPY 285999
Price includes VAT (Japan)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Recommended Reading

  1. Abouelhoda M, Kurtz S, Ohlebusch E (2004) Replacing suffix trees with enhanced suffix arrays. J Discret Algorithms 2:53–86

    Article  MathSciNet  MATH  Google Scholar 

  2. Aluru S (ed) (2005) Handbook of computational molecular biology, Computer and Information Science Series. Chapman and Hall/CRC, Boca Raton

    MATH  Google Scholar 

  3. Amir A, Kopelowitz T, Lewenstein M, Lewenstein N (2005) Towards real-time suffix tree construction. In: Proceedings of the string processing and information retrieval symposium (SPIRE), pp 67–78

    Google Scholar 

  4. Ciriani V, Ferragina P, Luccio F, Muthukrishnan S (2007) A data structure for a sequence of string accesses in external memory. ACM Trans Algorithms 3

    Google Scholar 

  5. Crescenzi P, Grossi R, Italiano G (2003) Search data structures for skewed strings. In: International workshop on experimental and efficient algorithms (WEA). Lecture notes in computer science, vol 2. Springer, Berlin, pp 81–96

    Google Scholar 

  6. Crochemore M, Rytter W (2002) Jewels of stringology. World Scientific Publishing Company, Singapore

    Book  MATH  Google Scholar 

  7. Ferragina P, Grossi R (1998) Optimal on-line search and sublinear time update in string matching. SIAM J Comput 3:713–736

    Article  MathSciNet  MATH  Google Scholar 

  8. Franceschini G, Grossi R (2004) A general technique for managing strings in comparison-driven data structures. In: Annual international colloquium on automata, languages and programming (ICALP)

    Google Scholar 

  9. Grossi R, Italiano G (1999) Efficient techniques for maintaining multidimensional keys in linked data structures. In: Annual international colloquium on automata, languages and programming (ICALP), pp 372–381

    Google Scholar 

  10. Gusfield D (1997) Algorithms on strings, trees and sequences: computer science and computational biology. Cambridge University Press, New York

    Book  MATH  Google Scholar 

  11. Karkkainen J, Sanders P, Burkhardt S (2006) Linear work suffix arrays construction. J ACM 53:918–936

    Article  MathSciNet  MATH  Google Scholar 

  12. Kasai T, Lee G, Arimura H et al (2001) Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Proceedings of the 12th annual symposium, combinatorial pattern matching (CPM), pp 181–192

    Google Scholar 

  13. Ko P, Aluru S (2005) Space efficient linear time construction of suffix arrays. J Discret Algorithms 3:143–156

    Article  MathSciNet  MATH  Google Scholar 

  14. Ko P, Aluru S (2007) Optimal self-adjustring tree for dynamic string data in secondary storage. In: Proceedings of the string processing and information retrieval symposium (SPIRE), Santiago. Lecture notes in computer science, vol 4726, pp 184–194

    Google Scholar 

  15. Manber U, Myers G (1993) Suffix arrays: a new method for on-line search. SIAM J Comput 22:935–948

    Article  MathSciNet  MATH  Google Scholar 

  16. Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33:31–88

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Srinivas Aluru .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer Science+Business Media New York

About this entry

Cite this entry

Aluru, S. (2016). Text Indexing. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_422

Download citation

Publish with us

Policies and ethics