Abstract
In this paper, we address the problem of distinguishing handwriting from machine-printed text in real documents (this work is funded by the PiXL project, supported by the “Fonds national pour la Société Numérique” of the French State. http://valconum.fr/index.php/les-projets/pixl). We present a reliable method based on a novel set of features belonging to two categories, linearity and regularity, which are invariant to translation and scaling. Specifically, we introduce a novel linearity measure derived from the histogram of straight line segment lengths. The resulting framework is independent of the document layout and supports any language written in the Latin script. Its performance is assessed on a dataset of real documents comprising heterogeneous administrative images. Experimental results demonstrate its accuracy, with a recognition rate of up to 90%.
The authors would like to thank the company ITESOFT for providing the dataset and for their help in carrying out the comparison with the method of Belaïd et al. [1].
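The abstract does not give the definition of the linearity measure, so the following is only an illustrative sketch of how a score could be derived from a histogram of straight line segment lengths. It assumes each stroke has already been approximated by a polyline; the function names, the bounding-box normalisation, the number of bins, and the scoring formula are all hypothetical choices, not the authors' method.

```python
import math


def segment_lengths(polyline):
    """Euclidean length of each straight segment of a polyline of (x, y) points."""
    return [math.dist(p, q) for p, q in zip(polyline, polyline[1:])]


def length_histogram(lengths, scale, bins=4):
    """Length-weighted histogram of segment lengths divided by `scale`.

    Bin i collects relative lengths in [i/bins, (i+1)/bins); anything longer
    falls into the last bin. The weights sum to 1 when the input is non-empty.
    """
    hist = [0.0] * bins
    total = sum(lengths)
    if total == 0 or scale <= 0:
        return hist
    for length in lengths:
        idx = min(int(length / scale * bins), bins - 1)
        hist[idx] += length / total
    return hist


def stroke_linearity(polyline, bins=4):
    """Hypothetical linearity score in [0, 1] (assumption, not the paper's formula).

    Lengths are differences of coordinates, so the score ignores translation;
    dividing by the largest bounding-box side makes it ignore uniform scaling.
    """
    xs = [x for x, _ in polyline]
    ys = [y for _, y in polyline]
    scale = max(max(xs) - min(xs), max(ys) - min(ys))
    hist = length_histogram(segment_lengths(polyline), scale, bins)
    top = bins - 1
    # Weighted mean bin index, rescaled so that "all long segments" gives 1.
    return sum(i * w for i, w in enumerate(hist)) / top if top else 0.0


# An "L"-like shape made of two long segments (printed-like).
l_shape = [(0, 0), (0, 10), (6, 10)]
# A closed circle approximated by 24 short segments (cursive-like).
circle = [(5 * math.cos(2 * math.pi * k / 24), 5 * math.sin(2 * math.pi * k / 24))
          for k in range(25)]
# The same "L" scaled by 3 and shifted, to illustrate the invariance.
l_moved = [(3 * x + 7, 3 * y - 2) for x, y in l_shape]
```

Under these assumptions, shapes dominated by a few long straight segments score close to 1 and shapes made of many short segments score close to 0; a classifier would still need regularity features and training data, which this sketch does not cover.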
References
Belaïd, A., Santosh, K.C., D’Andecy, V.P.: Handwritten and printed text separation in real document. CoRR, abs/1303.4614 (2013)
Zagoris, K., Pratikakis, I., Antonacopoulos, A., Gatos, B., Papamarkos, N.: Handwritten and machine printed text separation in document images using the bag of visual words. In: International Conference on Frontiers in Handwriting Recognition (2012)
Peng, X., Setlur, S., Govindaraju, V., Sitaram, R.: Handwritten text separation from annotated machine printed documents using Markov random fields. IJDAR 16(1), 1–16 (2013)
Wahl, R., Wong, K., Casey, R.: Block Segmentation and Text Extraction in Mixed Text/Image Documents. IBM Research Lab, San Jose, California, Research Report RJ3356 (40312) (December 1981)
Zheng, Y., Li, H., Doermann, D.: Machine printed text and handwriting identification in noisy document images. University of Maryland, College Park, Technical Report (September 2003)
Shirdhonkar, M., Kokare, M.B.: Discrimination between printed and handwritten text in documents. IJCA 3, 131–134 (2010). Special Issue on RTIPPR
Bilane, P., Bres, S., Emptoz, H.: Robust directional features for wordspotting in degraded Syriac manuscripts. In: International Workshop on Content-Based Multimedia Indexing, CBMI 2008, pp. 526–533 (June 2008)
Berlemont, S., Aaron, B., Cloppet, F., Olivo-Marin, J.-C.: Detection of linear structures in biological images. In: Conference Record of the Forty-First Asilomar Conference on Signals, Systems and Computers 2007, pp. 1279–1283 (November 2007)
Siddiqi, I., Vincent, N.: Text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features. Pattern Recognition 43(11), 3853–3865 (2010)
Wall, K., Danielsson, P.-E.: A fast sequential method for polygonal approximation of digitized curves. Computer Vision Graphics and Image Processing 28(3), 220–227 (1984)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Hamrouni, S., Cloppet, F., Vincent, N. (2014). Handwritten and Printed Text Separation: Linearity and Regularity Assessment. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol 8814. Springer, Cham. https://doi.org/10.1007/978-3-319-11758-4_42
DOI: https://doi.org/10.1007/978-3-319-11758-4_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11757-7
Online ISBN: 978-3-319-11758-4
eBook Packages: Computer Science, Computer Science (R0)