Adaptive Script-Independent Text Line Extraction
IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Adaptive Script-Independent Text Line Extraction
Majid ZIARATBANKarim FAEZ
Author information
JOURNAL FREE ACCESS

2011 Volume E94.D Issue 4 Pages 866-877

Details
Abstract

In this paper, an adaptive block-based text line extraction algorithm is proposed. Three global and two local parameters are defined to adapt the method to various handwritings in different languages. A document image is segmented into several overlapping blocks. The skew of each block is estimated. Text block is de-skewed by using the estimated skew angle. Text regions are detected in the de-skewed text block. A number of data points are extracted from the detected text regions in each block. These data points are used to estimate the paths of text lines. By thinning the background of the image including text line paths, text line boundaries or separators are estimated. Furthermore, an algorithm is proposed to assign to the extracted text lines the connected components which have intersections with the estimated separators. Extensive experiments on different standard datasets in various languages demonstrate that the proposed algorithm outperforms previous methods.

Content from these authors
© 2011 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top