Abstract
A linear-time approximation algorithm for the grammar-based compression, which is an optimization problem to minimize the size of a context-free grammar deriving a given string, is presented. For each string of length n over unbounded alphabet, the algorithm guarantees O(log2 n) approximation ratio without suffix tree and runs in O(n) time in the sense of randomized model.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela, and M. Protasi. Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties. Springer, 1999.
S. De Agostino and J. A. Storer. On-Line versus Off-Line Computation in Dynamic Text Compression. Inform. Process. Lett., 59:169–174, 1996.
M. Charikar, E. Lehman, D. Liu, R. Panigrahy, M. Prabhakaran, A. Rasala, A. Sahai, and A. Shelat. Approximating the Smallest Grammar: Kolmogorov Complexity in Natural Models. In Proc. 29th Ann. Sympo. on Theory of Computing, 792–801, 2002.
D. Gusfield. Algorithms on Strings, Trees, and Sequences. Computer Science and Computational Biology. Cambridge University Press, 1997.
T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. Collage System: a Unifying Framework for Compressed Pattern Matching. Theoret. Comput. Sci. (to appear).
J. C. Kieffer and E.-H. Yang. Grammar-Based Codes: a New Class of Universal Lossless Source Codes. IEEE Trans. on Inform. Theory, 46(3):737–754, 2000.
J. C. Kieffer, E.-H. Yang, G. Nelson, and P. Cosman. Universal Lossless Compression via Multilevel Pattern Matching. IEEE Trans. Inform. Theory, IT-46(4), 1227–1245, 2000.
D. Knuth. Seminumerical Algorithms. Addison-Wesley, 441–462, 1981.
N. J. Larsson and A. Moffat. Offline Dictionary-Based Compression. Proceedings of the IEEE, 88(11):1722–1732, 2000.
E. Lehman. Approximation Algorithms for Grammar-Based Compression. PhD thesis, MIT, 2002.
E. Lehman and A. Shelat. Approximation Algorithms for Grammar-Based Compression. In Proc. 20th Ann. ACM-SIAM Sympo. on Discrete Algorithms, 205–212, 2002.
M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of Mathematics and Its Applications. Addison-Wesley, 1983.
M. Farach. Optimal Suffix Tree Construction with Large Alphabets. In Proc. 38th Ann. Sympo. on Foundations of Computer Science, 137–143, 1997.
C. Nevill-Manning and I. Witten. Compression and Explanation Using Hierarchical Grammars. Computer Journal, 40(2/3):103–116, 1997.
W. Rytter. Application of Lempel-Ziv Factorization to the Approximation of Grammar-Based Compression. In Proc. 13th Ann. Sympo. Combinatorial Pattern Matching, 20–31, 2002.
J. A. Storer and T. G. Szymanski. The Macro Model for Data Compression. In Proc. 10th Ann. Sympo. on Theory of Computing, pages 30–39, San Diego, California, 1978. ACM Press.
T. A. Welch. A Technique for High Performance Data Compression. IEEE Comput., 17:8–19, 1984.
E.-H. Yang and J. C. Kieffer. Efficient Universal Lossless Data Compression Algorithms Based on a Greedy Sequential Grammar Transform-Part One: without Context Models. IEEE Trans. on Inform. Theory, 46(3):755–777, 2000.
J. Ziv and A. Lempel. A Universal Algorithm for Sequential Data Compression. IEEE Trans. on Inform. Theory, IT-23(3):337–349, 1977.
J. Ziv and A. Lempel. Compression of Individual Sequences via Variable-Rate Coding. IEEE Trans. on Inform. Theory, 24(5):530–536, 1978.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sakamoto, H. (2003). A Fully Linear-Time Approximation Algorithm for Grammar-Based Compression. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds) Combinatorial Pattern Matching. CPM 2003. Lecture Notes in Computer Science, vol 2676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44888-8_25
Download citation
DOI: https://doi.org/10.1007/3-540-44888-8_25
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40311-1
Online ISBN: 978-3-540-44888-4
eBook Packages: Springer Book Archive