Improving Time and Space Complexity for Compressed Pattern Matching | SpringerLink
Skip to main content

Improving Time and Space Complexity for Compressed Pattern Matching

  • Conference paper
Algorithms and Computation (ISAAC 2006)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4288))

Included in the following conference series:

Abstract

The compressed pattern matching problem is to find all occurrences of a given pattern in a compressed text. In this paper an efficient grammar-based compression algorithm is presented for the compressed pattern matching. The algorithm achieves the worst-case approximation ratio O(g *logg *logn) for the optimum grammar size g * with an input text of length n. This upper bound improves the complexity of the compressed pattern matching problem to \(O(g_*\log g_*\log m + \frac{n}{m} + m^2 + r)\) time and O(g *logg *logm + m 2) space for any pattern shorter than m and the number r of pattern occurrences.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 11439
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 14299
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Amir, A., Benson, G.: Efficient two-dimensional compressed matching. In: Proc. Data Compression Conference, p. 279 (1992)

    Google Scholar 

  2. Amir, A., Benson, G., Farach, M.: Let sleeping files lie: pattern matching in Z-compressed files. J. Comput. System Sci. 52, 299–307 (1996)

    Article  MATH  MathSciNet  Google Scholar 

  3. Ausiello, G., Crescenzi, P., Gambosi, G., Kann, V., Marchetti-Spaccamela, A., Protasi, M.: Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties. Springer, Heidelberg (1999)

    MATH  Google Scholar 

  4. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Rasala, A., Sahai, A., Shelat, A.: Approximating the Smallest Grammar: Kolmogorov Complexity in Natural Models. In: Proc. 29th Ann. Sympo. on Theory of Computing, pp. 792–801 (2002)

    Google Scholar 

  5. Farach, M., Thorup, M.: String-matching in Lempel-Ziv compressed strings. In: 27th ACM STOC, pp. 703–713 (1995)

    Google Scholar 

  6. Gusfield, D.: Algorithms on Strings, Trees, and Sequences, Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)

    Book  MATH  Google Scholar 

  7. Kida, T., Shibata, Y., Takeda, M., Shinohara, A., Arikawa, S.: Collage System: a Unifying Framework for Compressed Pattern Matching. Theoret. Comput. Sci. 298, 253–272

    Google Scholar 

  8. Kida, T., Takeda, M., Shinohara, A., Miyazaki, M., Arikawa, S.: Multiple pattern matching in LZW compressed text. J. Discrete Algorithms 1(1), 133–158 (2000)

    MathSciNet  Google Scholar 

  9. Kieffer, J.C., Yang, E.-H.: Grammar-Based Codes: a New Class of Universal Lossless Source Codes. IEEE Trans. on Inform. Theory 46(3), 737–754 (2000)

    Article  MATH  MathSciNet  Google Scholar 

  10. Kieffer, J.C., Yang, E.-H., Nelson, G., Cosman, P.: Universal Lossless Compression via Multilevel Pattern Matching. IEEE Trans. Inform. Theory IT-46(4), 1227–1245 (2000)

    Article  MathSciNet  Google Scholar 

  11. Knuth, D.: Seminumerical Algorithms, pp. 441–462. Addison-Wesley, Reading (1981)

    MATH  Google Scholar 

  12. Larsson, N.J., Moffat, A.: Offline Dictionary-Based Compression. Proceedings of the IEEE 88(11), 1722–1732 (2000)

    Article  Google Scholar 

  13. Lehman, E.: Approximation Algorithms for Grammar-Based Compression, PhD thesis, MIT (2002)

    Google Scholar 

  14. Lehman, E., Shelat, A.: Approximation Algorithms for Grammar-Based Compression. In: Proc. 20th Ann. ACM-SIAM Sympo. on Discrete Algorithms, pp. 205–212 (2002)

    Google Scholar 

  15. Lothaire, M.: Combinatorics on Words. Encyclopedia of Mathematics and Its Applications, vol. 17. Addison-Wesley, Reading (1983)

    MATH  Google Scholar 

  16. Navarro, G., Raffinot, M.: A general practical approach to pattern matching over Ziv-Lempel compressed text. In: Crochemore, M., Paterson, M. (eds.) CPM 1999. LNCS, vol. 1645, pp. 14–36. Springer, Heidelberg (1999)

    Chapter  Google Scholar 

  17. Nevill-Manning, C., Witten, I.: Compression and Explanation Using Hierarchical Grammars. Computer Journal 40(2/3), 103–116 (1997)

    Article  Google Scholar 

  18. Nevill-Manning, C., Witten, I.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artificial Intelligence Research 7, 67–82 (1997)

    MATH  Google Scholar 

  19. Rytter, W.: Application of Lempel-Ziv Factorization to the Approximation of Grammar-Based Compression. In: Proc. 13th Ann. Sympo. Combinatorial Pattern Matching, pp. 20–31 (2002)

    Google Scholar 

  20. Sakamoto, H.: A Fully Linear-Time Approximation Algorithm for Grammar-Based Compression. Journal of Discrete Algorithms 3, 416–430 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  21. Sakamoto, H., Kida, T., Shimozono, S.: A Space-Saving Linear-Time Algorithm for Grammar-Based Compression. In: Proc. 11th International Symposium on String Processing and Information Retrieval, pp. 218–229 (2004)

    Google Scholar 

  22. Salomon, D.: Data compression: the complete reference, 2nd edn. Springer, Heidelberg (1998)

    Google Scholar 

  23. Storer, J., Szymanski, T.: Data compression via textual substitution. J. Assoc. Comput. Mach. 29(4), 928–951 (1982)

    MATH  MathSciNet  Google Scholar 

  24. Storer, J.A., Szymanski, T.G.: The Macro Model for Data Compression. In: Proc. 10th Ann. Sympo. on Theory of Computing, pp. 30–39 (1978)

    Google Scholar 

  25. Welch, T.A.: A Technique for High Performance Data Compression. IEEE Comput. 17, 8–19 (1984)

    Google Scholar 

  26. Wu, S., Manber, U.: Agrep–a fast approximate pattern-matching tool. In: Usenix Winter 1992 Technical Conference, pp. 153–162 (1992)

    Google Scholar 

  27. Yang, E.-H., Kieffer, J.C.: Efficient Universal Lossless Data Compression Algorithms Based on a Greedy Sequential Grammar Transform–Part One: without Context Models. IEEE Trans. on Inform. Theory 46(3), 755–777 (2000)

    Article  MATH  MathSciNet  Google Scholar 

  28. Ziv, J., Lempel, A.: A Universal Algorithm for Sequential Data Compression. IEEE Trans. on Inform. Theory IT-23(3), 337–349 (1977)

    Article  MathSciNet  Google Scholar 

  29. Ziv, J., Lempel, A.: Compression of Individual Sequences via Variable-Rate Coding. IEEE Trans. on Inform. Theory 24(5), 530–536 (1978)

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maruyama, S., Miyagawa, H., Sakamoto, H. (2006). Improving Time and Space Complexity for Compressed Pattern Matching. In: Asano, T. (eds) Algorithms and Computation. ISAAC 2006. Lecture Notes in Computer Science, vol 4288. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940128_49

Download citation

  • DOI: https://doi.org/10.1007/11940128_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49694-6

  • Online ISBN: 978-3-540-49696-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics