Graphical-character-based shredded Chinese document reconstruction

Xing, Nan; Zhang, Jianqi

doi:10.1007/s11042-016-3685-7

Published: 01 July 2016

Volume 76, pages 12871–12891, (2017)

Multimedia Tools and Applications

Nan Xing¹ &
Jianqi Zhang¹

Abstract

Shredding paper documents is currently a common means of protecting textual information. Because a shredder produces a large number of pieces that are hard to tell apart, reconstructing a shredded document by reversing the operation is challenging. Nevertheless, recovering shredded documents is an important research topic in digital forensics, with broad applicability in information security and judicial investigations. Researchers have proposed various feasible algorithms for restoring shredded documents, but most of them target documents in Western languages. Because languages differ greatly, these algorithms are difficult to apply directly to documents in other languages. Chinese is used worldwide and Chinese documents are widespread, so there is great demand for Chinese document reconstruction. This paper presents a complete algorithm for reconstructing shredded Chinese documents. Based on the structural features of the characters, we apply graphics processing to the text on the pieces, match the pieces by graph assembling, and restore the shredded document. We test the algorithm's performance on an actual sample, and the experimental results show that the proposed method can effectively restore the shredded document, with an average accuracy of 85.78%. Moreover, the algorithm is highly automated: a human participates only in scanning the pieces, and all other calculation steps are completed automatically by the computer.
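
The abstract does not spell out how pieces are matched, so the following is only a generic illustration of the matching problem, not the paper's graph-assembling method. It is a minimal sketch that assumes the pieces are equal-height vertical strips already binarized to 0/1 pixels, and that the leftmost strip is known. The helpers `cut`, `edge_cost`, and `greedy_order` are hypothetical names introduced here; the cost simply counts pixel rows where two touching edges disagree.

```python
from typing import List

Strip = List[List[int]]  # rows of 0/1 pixels; 1 = ink (assumed binarized input)


def cut(page: Strip, width: int) -> List[Strip]:
    """Cut a page into vertical strips of `width` columns (toy shredder)."""
    return [[row[c:c + width] for row in page]
            for c in range(0, len(page[0]), width)]


def edge_cost(left: Strip, right: Strip) -> int:
    """Count rows where the right edge of `left` and the left edge of `right` differ."""
    return sum(1 for lrow, rrow in zip(left, right) if lrow[-1] != rrow[0])


def greedy_order(strips: List[Strip], first: int) -> List[int]:
    """Starting from the known leftmost strip, repeatedly append the cheapest neighbour."""
    order = [first]
    remaining = set(range(len(strips))) - {first}
    while remaining:
        last = strips[order[-1]]
        best = min(sorted(remaining), key=lambda i: edge_cost(last, strips[i]))
        order.append(best)
        remaining.remove(best)
    return order


# Toy page: strokes continue unchanged across both cut positions.
page = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
strips = cut(page, 2)
shuffled = [strips[2], strips[0], strips[1]]   # simulate an unordered pile
order = greedy_order(shuffled, first=1)        # index 1 holds the leftmost strip
restored = [sum((shuffled[i][r] for i in order), []) for r in range(len(page))]
```

On this toy input the greedy pass recovers the original column order, so `restored` equals `page`. Real pieces are far less discriminative, which is why a purely local edge cost like this one is rarely sufficient on its own.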

Notes

Punctuation symbols described in this paper are defined by General Rules for Punctuation (GB/T 15834–2011), promulgated by the General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China in 2011.

Acknowledgments

Support for this program is provided by Xidian University. Additional support has been provided by Xi’an University of Technology.

I thank Professor Hong Zhu for her valuable suggestions and Dr. Pei Liu for his comments on the manuscript. I would also like to thank Yi Zhou and Jing Zhang for their technical assistance in the experiments.

Author information

Authors and Affiliations

School of Physics and Optoelectronic Engineering, Xidian University, Xi’an, 710071, China
Nan Xing & Jianqi Zhang

Corresponding author

Correspondence to Nan Xing.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Xing, N., Zhang, J. Graphical-character-based shredded Chinese document reconstruction. Multimed Tools Appl 76, 12871–12891 (2017). https://doi.org/10.1007/s11042-016-3685-7

Received: 23 August 2015
Revised: 06 April 2016
Accepted: 14 June 2016
Published: 01 July 2016
Issue Date: May 2017
DOI: https://doi.org/10.1007/s11042-016-3685-7
