Abstract
In this paper we are interested in the study of the combinatorial aspects connecting three important constructions in the field of string algorithms: the suffix array, the Burrows-Wheeler transform (BWT) and the extended Burrows-Wheeler transform (EBWT). Such constructions involve the notions of suffixes and conjugates of words and are based on two different order relations, denoted by < lex and ≺ ω , that, even if strictly connected, are quite different from the computational point of view. In this study an important role is played by Lyndon words. In particular, we improve the upper bound on the number of symbol comparisons needed to establish the ≺ ω order between two primitive words by using a preliminary knowledge of the < lex order of the corresponding Lyndon conjugates. Moreover, we propose an algorithm that efficiently sorts, according to the ≺ ω order, the list of conjugates of a multiset of Lyndon words. Finally, we show that the Lyndon factorization of a word helps the construction of its suffix array, allowing a reduction of the number of symbol comparisons needed to lexicographically sort the suffixes of the word.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Adjeroh, D., Bell, T., Mukherjee, A.: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching, 1st edn. Springer Publishing Company, Incorporated (2008)
Bauer, M.J., Cox, A.J., Rosone, G.: Lightweight BWT construction for very large string collections. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 219–231. Springer, Heidelberg (2011)
Bauer, M.J., Cox, A.J., Rosone, G.: Lightweight algorithms for constructing and inverting the BWT of string collections. Theoret. Comput. Sci. 483, 134–148 (2013)
Bauer, M.J., Cox, A.J., Rosone, G., Sciortino, M.: Lightweight LCP construction for next-generation sequencing datasets. In: Raphael, B., Tang, J. (eds.) WABI 2012. LNCS, vol. 7534, pp. 326–337. Springer, Heidelberg (2012)
Burrows, M., Wheeler, D.J.: A block sorting data compression algorithm. Technical report, DIGITAL System Research Center (1994)
Chen, K.T., Fox, R.H., Lyndon, R.C.: Free differential calculus. IV. The quotient groups of the lower central series. Ann. of Math. 68(2), 81–95 (1958)
Crochemore, M., Désarménien, J., Perrin, D.: A note on the Burrows-Wheeler transformation. Theoret. Comput. Sci. 332, 567–572 (2005)
Duval, J.-P.: Factorizing words over an ordered alphabet. Journal of Algorithms 4(4), 363–381 (1983)
Duval, J.-P., Lefebvre, A.: Words over an ordered alphabet and suffix permutations. RAIRO Theor. Inform. Appl. 36(3), 249–259 (2002)
Ferragina, P., Giancarlo, R., Manzini, G., Sciortino, M.: Boosting textual compression in optimal linear time. J. ACM 52(4), 688–713 (2005)
Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, pp. 390–398. IEEE Computer Society (2000)
Fine, N.J., Wilf, H.S.: Uniqueness theorem for periodic functions. Proc. Am. Mathematical Society (16), 109–114 (1965)
Gessel, I.M., Reutenauer, C.: Counting permutations with given cycle structure and descent set. J. Combin. Theory Ser. A 64(2), 189–215 (1993)
Giancarlo, R., Restivo, A., Sciortino, M.: From first principles to the Burrows and Wheeler transform and beyond, via combinatorial optimization. Theoret. Comput. Sci. 387(3), 236–248 (2007)
Gil, J.Y., Scott, D.A.: A bijective string sorting transform. CoRR, abs/1201.3077 (2012)
Hohlweg, C., Reutenauer, C.: Lyndon words, permutations and trees. Theoret. Comput. Sci. 307(1), 173–178 (2003)
Hon, W.-K., Ku, T.-H., Lu, C.-H., Shah, R., Thankachan, S.V.: Efficient algorithm for circular Burrows-Wheeler transform. In: Kärkkäinen, J., Stoye, J. (eds.) CPM 2012. LNCS, vol. 7354, pp. 257–268. Springer, Heidelberg (2012)
Kucherov, G., Tóthmérész, L., Vialette, S.: On the combinatorics of suffix arrays. CoRR, abs/1206.3877 (2012)
Kufleitner, M.: On bijective variants of the Burrows-Wheeler transform. In: Proceedings of the Prague Stringology Conference 2009, pp. 65–79 (2009)
Lothaire, M.: Applied Combinatorics on Words (Encyclopedia of Mathematics and its Applications). Cambridge University Press, New York (2005)
Mantaci, S., Restivo, A., Rosone, G., Sciortino, M.: An Extension of the Burrows Wheeler Transform and Applications to Sequence Comparison and Data Compression. In: Apostolico, A., Crochemore, M., Park, K. (eds.) CPM 2005. LNCS, vol. 3537, pp. 178–189. Springer, Heidelberg (2005)
Mantaci, S., Restivo, A., Rosone, G., Sciortino, M.: An extension of the Burrows-Wheeler Transform. Theoret. Comput. Sci. 387(3), 298–312 (2007)
Mantaci, S., Restivo, A., Rosone, G., Sciortino, M.: A new combinatorial approach to sequence comparison. Theory Comput. Syst. 42(3), 411–429 (2008)
Navarro, G., Nekrich, Y.: Optimal dynamic sequence representations. In: Proc. 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 865–876 (2013)
Puglisi, S.J., Smyth, W.F., Turpin, A.H.: A taxonomy of suffix array construction algorithms. ACM Comput. Surv. 39 (2007)
Seward, J.: The bzip2 home page, http://www.bzip.org
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonomo, S., Mantaci, S., Restivo, A., Rosone, G., Sciortino, M. (2013). Suffixes, Conjugates and Lyndon Words. In: Béal, MP., Carton, O. (eds) Developments in Language Theory. DLT 2013. Lecture Notes in Computer Science, vol 7907. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38771-5_13
Download citation
DOI: https://doi.org/10.1007/978-3-642-38771-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38770-8
Online ISBN: 978-3-642-38771-5
eBook Packages: Computer ScienceComputer Science (R0)