{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T23:34:51Z","timestamp":1721259291712},"reference-count":69,"publisher":"Association for Computing Machinery (ACM)","funder":[{"name":"FAP-DF","award":["13619.54.33621.1708\/2016, and 0193.001380\/2017"]},{"name":"FAL","award":["2017\/09105-0"]},{"DOI":"10.13039\/501100001807","name":"S\u00e3o Paulo Research Foundation","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001807","id-type":"DOI","asserted-by":"crossref"}]},{"name":"FAP-DF and CNPq","award":["DE 193.001.369\/2016, and PQ 307672\/2017-4"]},{"name":"Basal Funds","award":["FB0001"]},{"name":"Fondecyt","award":["1-200038"]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ACM J. Exp. Algorithmics"],"published-print":{"date-parts":[[2022,12,31]]},"abstract":"A grammar compression algorithm, called GCIS, is introduced in this work. GCIS is based on the induced suffix sorting algorithm SAIS, presented by Nong et\u00a0al. in 2009. The proposed solution builds on the factorization performed by SAIS during suffix sorting. A context-free grammar is used to replace factors by non-terminals. The algorithm is then recursively applied on the shorter sequence of non-terminals. The resulting grammar is encoded by exploiting some redundancies, such as common prefixes between the right-hand sides of rules, sorted according to SAIS. GCIS stands out for the low space and time it requires for compression while obtaining competitive compression ratios. Our experiments on regular and repetitive, moderate and very large texts, show that GCIS stands as a very convenient choice compared to well-known compressors such as Gzip, 7-Zip, and RePair, the gold standard in grammar compression, and to recent compressors such as SOLCA, LZRR, and LZD. In exchange, GCIS is slow at decompressing. 
Yet, grammar compressors are more convenient than Lempel-Ziv compressors in that one can access text substrings directly in compressed form without ever decompressing the text. We demonstrate that GCIS is an excellent candidate for this scenario, because it is competitive with its RePair-based alternatives. We also show that the relation with SAIS makes GCIS a good intermediate structure to build the suffix array and the LCP array during decompression of the text.","DOI":"10.1145\/3549992","type":"journal-article","created":{"date-parts":[[2022,8,26]],"date-time":"2022-08-26T12:10:41Z","timestamp":1661515841000},"page":"1-33","source":"Crossref","is-referenced-by-count":2,"title":["Grammar Compression by Induced Suffix Sorting"],"prefix":"10.1145","volume":"27","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-6870-1397","authenticated-orcid":false,"given":"Daniel S. N.","family":"Nunes","sequence":"first","affiliation":[{"name":"Federal Institute of Bras\u00edlia and Department of Computer Science, University of Bras\u00edlia, Distrito Federal, Brazil"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-2931-1470","authenticated-orcid":false,"given":"Felipe A.","family":"Louza","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering, Federal University of Uberl\u00e2ndia, Minas Gerais, Brazil"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-5450-8630","authenticated-orcid":false,"given":"Simon","family":"Gog","sequence":"additional","affiliation":[{"name":"eBay Inc., San Jose, California, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-0089-3905","authenticated-orcid":false,"given":"Mauricio","family":"Ayala-Rinc\u00f3n","sequence":"additional","affiliation":[{"name":"Departments of Computer Science and Mathematics, University of Bras\u00edlia, Distrito Federal, 
Brazil"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-2286-741X","authenticated-orcid":false,"given":"Gonzalo","family":"Navarro","sequence":"additional","affiliation":[{"name":"Center for Biotechnology and Bioengineering (CeBiB) and Department of Computer Science, University of Chile, Santiago, Chile"}]}],"member":"320","published-online":{"date-parts":[[2022,8,26]]},"reference":[{"key":"e_1_3_1_2_1","article-title":"Grammar index by induced suffix sorting","volume":"2105","author":"Akagi Tooru","year":"2021","unstructured":"Tooru Akagi, Dominik K\u00f6ppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. 2021. Grammar index by induced suffix sorting. CoRR abs\/2105.13744 (2021).","journal-title":"CoRR"},{"key":"e_1_3_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1712666.1712668"},{"key":"e_1_3_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2017.12.021"},{"key":"e_1_3_1_5_1","doi-asserted-by":"publisher","DOI":"10.1137\/130936889"},{"key":"e_1_3_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipm.2012.08.003"},{"key":"e_1_3_1_7_1","volume-title":"A Block-sorting Lossless Data Compression Algorithm","author":"Burrows Michael","year":"1994","unstructured":"Michael Burrows and David J. Wheeler. 1994. A Block-sorting Lossless Data Compression Algorithm. Technical Report. Digital SRC Research Report."},{"key":"e_1_3_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2005.850116"},{"key":"e_1_3_1_9_1","doi-asserted-by":"publisher","DOI":"10.3233\/FI-2011-565"},{"key":"e_1_3_1_10_1","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1007\/978-3-642-34109-0_19","volume-title":"19th International Symposium on String Processing and Information Retrieval (SPIRE) (LNCS 7608)","author":"Claude Francisco","year":"2012","unstructured":"Francisco Claude and Gonzalo Navarro. 2012. Improved grammar-based compressed indexes. In 19th International Symposium on String Processing and Information Retrieval (SPIRE) (LNCS 7608). 
Springer, 180\u2013192."},{"key":"e_1_3_1_11_1","article-title":"Genome Reference Consortium Human Reference 37","author":"Consortium Genome Reference","year":"2009","unstructured":"Genome Reference Consortium. 2009. Genome Reference Consortium Human Reference 37. Retrieved from http:\/\/hgdownload.cse.ucsc.edu\/goldenpath\/hg19\/chromosomes\/.","journal-title":"Retrieved from http:\/\/hgdownload.cse.ucsc.edu\/goldenpath\/hg19\/chromosomes\/"},{"key":"e_1_3_1_12_1","article-title":"Silesia Corpus","author":"Deorowicz Sebastian","year":"2003","unstructured":"Sebastian Deorowicz. 2003. Silesia Corpus. Retrieved from http:\/\/sun.aei.polsl.pl\/ sdeor\/index.php?page=silesia.","journal-title":"Retrieved from http:\/\/sun.aei.polsl.pl\/ sdeor\/index.php?page=silesia"},{"key":"e_1_3_1_13_1","first-page":"91","volume-title":"Australasian Computer Science Conference (ACSC)","author":"Dhaliwal Jasbir","year":"2012","unstructured":"Jasbir Dhaliwal, Simon J. Puglisi, and Andrew Turpin. 2012. Trends in suffix sorting: A survey of low memory algorithms. In Australasian Computer Science Conference (ACSC). Australian Computer Society Inc., 91\u201398."},{"key":"e_1_3_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC50243.2021.00016"},{"key":"e_1_3_1_15_1","volume-title":"28th International Symposium on String Processing and Information Retrieval (SPIRE)","author":"D\u00edaz-Dom\u00ednguez D.","year":"2021","unstructured":"D. D\u00edaz-Dom\u00ednguez, G. Navarro, and A. Pacheco. 2021. An LMS-based grammar self-index with local consistency properties. In 28th International Symposium on String Processing and Information Retrieval (SPIRE). Retrieved from https:\/\/users.dcc.uchile.cl\/gnavarro\/ps\/spire21.2.pdf."},{"key":"e_1_3_1_16_1","article-title":"Pizza-Chili Corpus","author":"Ferragina Paolo","year":"2005","unstructured":"Paolo Ferragina and Gonzalo Navarro. 2005a. Pizza-Chili Corpus. 
Retrieved from http:\/\/pizzachili.dcc.uchile.cl\/texts.html.","journal-title":"Retrieved from http:\/\/pizzachili.dcc.uchile.cl\/texts.html"},{"key":"e_1_3_1_17_1","article-title":"Pizza-Chili Repetitive Corpus","author":"Ferragina Paolo","year":"2005","unstructured":"Paolo Ferragina and Gonzalo Navarro. 2005b. Pizza-Chili Repetitive Corpus. Retrieved from http:\/\/pizzachili.dcc.uchile.cl\/repcorpus.html.","journal-title":"Retrieved from http:\/\/pizzachili.dcc.uchile.cl\/repcorpus.html"},{"key":"e_1_3_1_18_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1007\/978-3-642-22300-6_32","volume-title":"Workshop on Algorithms and Data Structures (WADS)","author":"Fischer Johannes","year":"2011","unstructured":"Johannes Fischer. 2011. Inducing the LCP-array. In Workshop on Algorithms and Data Structures (WADS)(Lecture Notes in Computer Science, Vol. 6844). Springer, Berlin, 374\u2013385."},{"key":"e_1_3_1_19_1","first-page":"62","volume-title":"Proceedings of the Prague Stringology Conference","author":"Fischer Johannes","year":"2017","unstructured":"Johannes Fischer and Florian Kurpicz. 2017. Dismantling DivSufSort. In Proceedings of the Prague Stringology Conference. Department of Theoretical Computer Science, Faculty of Information Technology, 62\u201376."},{"key":"e_1_3_1_20_1","doi-asserted-by":"crossref","unstructured":"Travis Gagie Tomohiro I. Giovanni Manzini Gonzalo Navarro Hiroshi Sakamoto Louisa Seelbach Benkner and Yoshimasa Takabatake. 2020. Practical Random Access to SLP-Compressed Texts. (2020). 
Accepted short paper SPIRE.","DOI":"10.1007\/978-3-030-59212-7_16"},{"key":"e_1_3_1_21_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1007\/978-3-030-32686-9_3","volume-title":"26th International Symposium on String Processing and Information Retrieval (SPIRE)","author":"Gagie Travis","year":"2019","unstructured":"Travis Gagie, Tomohiro I., Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, and Yoshimasa Takabatake. 2019. Rpair: Scaling up RePair with rsync. In 26th International Symposium on String Processing and Information Retrieval (SPIRE)(Lecture Notes in Computer Science, Vol. 11811). Springer-Verlag, Berlin, 35\u201344."},{"key":"e_1_3_1_22_1","article-title":"The gzip home page","author":"Gailly Jean-Loup","year":"2011","unstructured":"Jean-Loup Gailly and Mark Adler. 2011. Accessed: 3\/2017. The gzip home page. Retrieved from http:\/\/www.gzip.org\/.","journal-title":"Retrieved from http:\/\/www.gzip.org\/"},{"key":"e_1_3_1_23_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"326","DOI":"10.1007\/978-3-319-07046-9","volume-title":"Symposium on Experimental and Efficient Algorithms (SEA)","author":"Gog Simon","year":"2014","unstructured":"Simon Gog, Timo Beller, Alistair Moffat, and Matthias Petri. 2014. From theory to practice: Plug and play with succinct data structures. In Symposium on Experimental and Efficient Algorithms (SEA)(Lecture Notes in Computer Science, Vol. 8504). Springer, Cham, 326\u2013337."},{"key":"e_1_3_1_24_1","first-page":"25","volume-title":"Workshop on Algorithm Engineering and Experimentation (ALENEX)","author":"Gog Simon","year":"2011","unstructured":"Simon Gog and Enno Ohlebusch. 2011. Fast and lightweight LCP-array construction algorithms. In Workshop on Algorithm Engineering and Experimentation (ALENEX). 
ACM Digital Library, 25\u201334."},{"key":"e_1_3_1_25_1","first-page":"66","volume-title":"Information Retrieval","author":"Gonnet Gaston H.","year":"1992","unstructured":"Gaston H. Gonnet, Ricardo A. Baeza-Yates, and Tim Snider. 1992. New indices for text: PAT trees and PAT arrays. In Information Retrieval. Prentice-Hall, Inc., Upper Saddle River, NJ, 66\u201382."},{"key":"e_1_3_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.2014.62"},{"key":"e_1_3_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-19929-0_19"},{"key":"e_1_3_1_28_1","article-title":"Shaped SLP implementation","author":"I Tomohiro","year":"2020","unstructured":"Tomohiro I. 2020. Shaped SLP implementation. Retrieved from https:\/\/github.com\/itomomoti\/ShapedSlp.","journal-title":"Retrieved from https:\/\/github.com\/itomomoti\/ShapedSlp"},{"key":"e_1_3_1_29_1","first-page":"81","volume-title":"International Symposium on String Processing and Information Retrieval (SPIRE)","author":"Itoh Hideo","year":"1999","unstructured":"Hideo Itoh and Hozumi Tanaka. 1999. An efficient method for in memory construction of suffix arrays. In International Symposium on String Processing and Information Retrieval (SPIRE). IEEE, NY, 81\u201388."},{"key":"e_1_3_1_30_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1007\/978-3-642-38905-4_19","volume-title":"Annual Symposium on Combinatorial Pattern Matching (CPM)","author":"K\u00e4rkk\u00e4inen Juha","year":"2013","unstructured":"Juha K\u00e4rkk\u00e4inen, Dominik Kempa, and Simon J. Puglisi. 2013. Linear time Lempel-Ziv factorization: Simple, fast, small. In Annual Symposium on Combinatorial Pattern Matching (CPM)(Lecture Notes in Computer Science, Vol. 7922). 
Springer, Berlin, 189\u2013200."},{"key":"e_1_3_1_31_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1007\/978-3-642-02441-2_17","volume-title":"Annual Symposium on Combinatorial Pattern Matching (CPM)","author":"K\u00e4rkk\u00e4inen Juha","year":"2009","unstructured":"Juha K\u00e4rkk\u00e4inen, Giovanni Manzini, and Simon J. Puglisi. 2009. Permuted longest-common-prefix array. In Annual Symposium on Combinatorial Pattern Matching (CPM)(Lecture Notes in Computer Science, Vol. 5577). Springer, Berlin, 181\u2013192."},{"key":"e_1_3_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.841160"},{"key":"e_1_3_1_33_1","series-title":"Lecture Notes in Computer Science","first-page":"200","volume-title":"Annual Symposium on Combinatorial Pattern Matching (CPM)","author":"Ko Pang","year":"2003","unstructured":"Pang Ko and Srinivas Aluru. 2003. Space efficient linear time construction of suffix arrays. In Annual Symposium on Combinatorial Pattern Matching (CPM)(Lecture Notes in Computer Science, Vol. 2676). Springer, Berlin, 200\u2013210."},{"key":"e_1_3_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-020-00722-6"},{"key":"e_1_3_1_35_1","article-title":"Sais-lite suffix and LCP arrays construction algorithm","author":"Kurpicz Florian","year":"2015","unstructured":"Florian Kurpicz. 2015. Sais-lite suffix and LCP arrays construction algorithm. Retrieved from https:\/\/github.com\/kurpicz\/sais-lite-lcp.","journal-title":"Retrieved from https:\/\/github.com\/kurpicz\/sais-lite-lcp"},{"key":"e_1_3_1_36_1","article-title":"DivSufSort suffix and LCP arrays construction algorithm","author":"Kurpicz Florian","year":"2016","unstructured":"Florian Kurpicz. 2016. DivSufSort suffix and LCP arrays construction algorithm. 
Retrieved from https:\/\/github.com\/kurpicz\/libdivsufsort.","journal-title":"Retrieved from https:\/\/github.com\/kurpicz\/libdivsufsort"},{"key":"e_1_3_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.1999.755679"},{"key":"e_1_3_1_38_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2377"},{"key":"e_1_3_1_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2017.03.039"},{"key":"e_1_3_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ipl.2016.09.010"},{"key":"e_1_3_1_41_1","article-title":"Large Text Compression Benchmark","author":"Mahoney Matt","year":"2006","unstructured":"Matt Mahoney. 2006. Large Text Compression Benchmark. Retrieved from http:\/\/mattmahoney.net\/dc\/text.html.","journal-title":"Retrieved from http:\/\/mattmahoney.net\/dc\/text.html"},{"key":"e_1_3_1_42_1","doi-asserted-by":"publisher","DOI":"10.1137\/0222058"},{"key":"e_1_3_1_43_1","article-title":"Manzini\u2019s Lightweight Corpus","author":"Manzini Giovani","year":"2003","unstructured":"Giovani Manzini. 2003. Manzini\u2019s Lightweight Corpus. Retrieved from http:\/\/people.unipmn.it\/ manzini\/lightweight\/.","journal-title":"Retrieved from http:\/\/people.unipmn.it\/ manzini\/lightweight\/"},{"key":"e_1_3_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.2014.69"},{"key":"e_1_3_1_45_1","article-title":"DivSufSort suffix array construction algorithm","author":"Mori Yuta","year":"2008","unstructured":"Yuta Mori. 2008. DivSufSort suffix array construction algorithm. Retrieved from https:\/\/github.com\/y-256\/libdivsufsort.","journal-title":"Retrieved from https:\/\/github.com\/y-256\/libdivsufsort"},{"key":"e_1_3_1_46_1","article-title":"Sais-lite suffix sorting algorithm","author":"Mori Yuta","year":"2010","unstructured":"Yuta Mori. 2010. Sais-lite suffix sorting algorithm. 
Retrieved from https:\/\/sites.google.com\/site\/yuta256\/sais.","journal-title":"Retrieved from https:\/\/sites.google.com\/site\/yuta256\/sais"},{"key":"e_1_3_1_47_1","article-title":"Salmonella enterica subsp. enterica serovar Paratyphi B str. SPB7, complete sequence","year":"2007","unstructured":"NCBI. 2007. Salmonella enterica subsp. enterica serovar Paratyphi B str. SPB7, complete sequence. Retrieved from https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/NC_010102.","journal-title":"Retrieved from https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/NC_010102"},{"key":"e_1_3_1_48_1","article-title":"Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome","year":"2020","unstructured":"NCBI. 2020. Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome. Retrieved from https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/NC_045512.","journal-title":"Retrieved from https:\/\/www.ncbi.nlm.nih.gov\/nuccore\/NC_045512"},{"key":"e_1_3_1_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dam.2019.01.014"},{"key":"e_1_3_1_50_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.2019.00029"},{"key":"e_1_3_1_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ic.2020.104518"},{"key":"e_1_3_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2493175.2493180"},{"key":"e_1_3_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/DCC.2009.42"},{"key":"e_1_3_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2010.188"},{"key":"e_1_3_1_55_1","first-page":"42","volume-title":"IEEE Data Compression Conference (DCC)","author":"Nunes Daniel Saad Nogueira","year":"2018","unstructured":"Daniel Saad Nogueira Nunes, Felipe Alves da Louza, Simon Gog, Mauricio Ayala-Rinc\u00f3n, and Gonzalo Navarro. 2018. A grammar compression algorithm based on induced suffix sorting. In IEEE Data Compression Conference (DCC). 
IEEE, NY, 42\u201351."},{"key":"e_1_3_1_56_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1007\/978-3-642-21458-5_4","volume-title":"Annual Symposium on Combinatorial Pattern Matching (CPM)","author":"Ohlebusch Enno","year":"2011","unstructured":"Enno Ohlebusch and Simon Gog. 2011. Lempel-Ziv factorization revisited. In Annual Symposium on Combinatorial Pattern Matching (CPM)(Lecture Notes in Computer Science, Vol. 6661). Springer, Berlin, 15\u201326."},{"key":"e_1_3_1_57_1","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1007\/978-3-642-03784-9_9","volume-title":"International Symposium on String Processing and Information Retrieval (SPIRE)","author":"Okanohara Daisuke","year":"2009","unstructured":"Daisuke Okanohara and Kunihiko Sadakane. 2009. A linear-time burrows-wheeler transform using induced sorting. In International Symposium on String Processing and Information Retrieval (SPIRE)(Lecture Notes in Computer Science, Vol. 5721). Springer, Berlin, 90\u2013101."},{"key":"e_1_3_1_58_1","article-title":"The 7zip home page","author":"Pavlov Igor","year":"2016","unstructured":"Igor Pavlov. 2016. Accessed: 10\/2017. The 7zip home page. Retrieved from http:\/\/www.7-zip.org\/.","journal-title":"Retrieved from http:\/\/www.7-zip.org\/"},{"key":"e_1_3_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242471.1242472"},{"key":"e_1_3_1_60_1","article-title":"The bzip home page","author":"Seward Julian","year":"1996","unstructured":"Julian Seward. 1996. The bzip home page. Retrieved from http:\/\/www.bzip.org\/.","journal-title":"Retrieved from http:\/\/www.bzip.org\/"},{"key":"e_1_3_1_61_1","article-title":"PPMd algorithm variant j revision 1","author":"Shkarin Dmitry","year":"2006","unstructured":"Dmitry Shkarin. 2006. PPMd algorithm variant j revision 1. 
Retrieved from http:\/\/www.compression.ru\/ds\/.","journal-title":"Retrieved from http:\/\/www.compression.ru\/ds\/"},{"key":"e_1_3_1_62_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ESA.2017.67"},{"key":"e_1_3_1_63_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-07959-2_29"},{"key":"e_1_3_1_64_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-23826-5_25"},{"key":"e_1_3_1_65_1","article-title":"Andrew Trigell\u2019s Large Corpus","author":"Trigell Andrew","year":"1998","unstructured":"Andrew Trigell. 1998. Andrew Trigell\u2019s Large Corpus. Retrieved from https:\/\/www.samba.org\/ftp\/tridge\/large-corpus\/.","journal-title":"Retrieved from https:\/\/www.samba.org\/ftp\/tridge\/large-corpus\/"},{"key":"e_1_3_1_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433409"},{"key":"e_1_3_1_67_1","article-title":"Offline Dictionary-based Compression (RePair, Recursive Pairing)","author":"Wan Raymond","year":"2014","unstructured":"Raymond Wan. 2014. Offline Dictionary-based Compression (RePair, Recursive Pairing). Retrieved from https:\/\/github.com\/rwanwork\/Re-Pair.","journal-title":"Retrieved from https:\/\/github.com\/rwanwork\/Re-Pair"},{"key":"e_1_3_1_68_1","article-title":"Wikipedia\u2019s Pages and Articles XML Dump","year":"2019","unstructured":"Wikipedia. 2019. Wikipedia\u2019s Pages and Articles XML Dump. Retrieved from http:\/\/wikipedia.c3sl.ufpr.br\/enwiki\/20191120\/.","journal-title":"Retrieved from http:\/\/wikipedia.c3sl.ufpr.br\/enwiki\/20191120\/"},{"key":"e_1_3_1_69_1","volume-title":"Managing Gigabytes (2nd Ed.): Compressing and Indexing Documents and Images","author":"Witten Ian H.","year":"1999","unstructured":"Ian H. Witten, Alistair Moffat, and Timothy C. Bell. 1999. Managing Gigabytes (2nd Ed.): Compressing and Indexing Documents and Images. 
Morgan Kaufmann, San Francisco, CA."},{"key":"e_1_3_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1977.1055714"}],"container-title":["ACM Journal of Experimental Algorithmics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3549992","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,2]],"date-time":"2023-01-02T12:19:23Z","timestamp":1672661963000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3549992"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,26]]},"references-count":69,"alternative-id":["10.1145\/3549992"],"URL":"https:\/\/doi.org\/10.1145\/3549992","relation":{},"ISSN":["1084-6654","1084-6654"],"issn-type":[{"value":"1084-6654","type":"print"},{"value":"1084-6654","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,26]]}}}