Abstract
This paper proposes a novel linguistic steganography scheme with high imperceptibility and undetectability, built on secret message compression and candidate text selection. The proposed word indexing compression (WIC) algorithm reduces the length of the payload that is actually embedded, while a stego text selection strategy picks the stego text with the highest undetectability from a set of candidates. WIC losslessly compresses the secret message by combining a minimum-maximum weight algorithm with Huffman coding, with the help of the candidate cover text. To improve the anti-steganalysis capability, ten cover texts with small compression ratios are selected from a large cover text set, and the corresponding compressed secret message is embedded into each of them by synonym substitution. A single stego text is then selected by a rule derived from the distance between a cover text and its stego text. Experimental results show that the proposed compression algorithm achieves better compression ratios than the Huffman and LZW coding algorithms, leading to higher embedding efficiency, and that the proposed steganography, with compression and the stego text selection rule, resists steganalysis well.
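The selection pipeline summarized in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the callables `compress`, `embed`, and `distance` are hypothetical stand-ins for WIC, synonym-substitution embedding, and the cover-to-stego distance; the compression ratio is assumed to be compressed bits divided by original bits; the candidates are assumed to be the k covers with the smallest ratios; and the selection rule is assumed to be "smallest distance wins", whereas the abstract only says the rule is derived from that distance.

```python
import heapq
from typing import Callable, Sequence


def select_stego_text(
    secret: str,
    covers: Sequence[str],
    compress: Callable[[str, str], str],    # hypothetical: (secret, cover) -> bit string
    embed: Callable[[str, str], str],       # hypothetical: (cover, bits) -> stego text
    distance: Callable[[str, str], float],  # hypothetical: (cover, stego) -> distance
    k: int = 10,
) -> str:
    """Return one stego text chosen among the k covers that compress the secret best."""
    raw_bits = 8 * len(secret.encode("utf-8"))
    if raw_bits == 0 or not covers:
        raise ValueError("need a non-empty secret and at least one cover text")

    # Compress the secret against every cover text and record the ratio
    # (assumed here: compressed bits / original bits, so smaller is better).
    scored = []
    for cover in covers:
        bits = compress(secret, cover)
        scored.append((len(bits) / raw_bits, cover, bits))

    # Keep the k covers with the smallest compression ratios as candidates.
    candidates = heapq.nsmallest(k, scored, key=lambda item: item[0])

    # Embed into each candidate and keep the stego text closest to its cover
    # (assumed rule: minimum distance).
    best_stego, best_dist = None, float("inf")
    for _ratio, cover, bits in candidates:
        stego = embed(cover, bits)
        d = distance(cover, stego)
        if d < best_dist:
            best_stego, best_dist = stego, d
    return best_stego
```

Passing the three stages in as callables keeps the sketch independent of any particular compressor, embedding method, or distance measure, none of which the abstract specifies in detail.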
Acknowledgements
This work was performed in the project “Research on Theory and Approach of Secure Text Steganography”, supported by the National Natural Science Foundation of China (No. 61202439), and partly supported by the National Natural Science Foundation of China (No. 61302159).
Ethics declarations
The authors declare that they have no conflict of interest. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study.
Cite this article
Xiang, L., Wu, W., Li, X. et al. A linguistic steganography based on word indexing compression and candidate selection. Multimed Tools Appl 77, 28969–28989 (2018). https://doi.org/10.1007/s11042-018-6072-8
DOI: https://doi.org/10.1007/s11042-018-6072-8