Abstract
Understanding printed documents such as newspapers is a common intelligent activity of humans. Making a computer analyze a newspaper image and derive useful high-level representations requires the development and integration of techniques in several areas, including pattern recognition, computer vision, language understanding, and artificial intelligence. We describe the organization and several components of a newspaper image understanding system that begins with digitized images of newspaper pages and produces symbolic representations at several different levels. These representations include: the visual sketch (connected components extracted from the background), the physical layout (spatial extents of blocks corresponding to text, half-tones, and graphics), the logical layout (organization of story components), block primitives (e.g., recognized characters and words in text blocks, lines in graphics, and faces in photographs), and semantic nets corresponding to photographic and textual blocks (individually, as well as grouped together as stories). We describe algorithms for deriving several of these representations, as well as the interaction of the different modules.
This work was supported by the National Science Foundation grant IRI-86-13361 and by a grant from the Eastman Kodak Company.
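As an illustration of the lowest-level representation named in the abstract, the visual sketch, the following is a minimal sketch of connected-component extraction on a binarized page. The abstract does not say which labeling algorithm the system uses; the breadth-first flood fill, the 8-connectivity, the bounding-box output, and the function name `connected_components` are all assumptions made for this example.

```python
from collections import deque


def connected_components(image, background=0):
    """Return one bounding box (top, left, bottom, right) per foreground region.

    `image` is a list of equal-length rows of pixel values. Pixels equal to
    `background` are ignored; the rest are grouped by 8-connectivity.
    Hypothetical helper for illustration, not the paper's method.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == background or seen[r][c]:
                continue
            # Start a new component and flood-fill it breadth-first.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and image[ny][nx] != background):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


page = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
print(connected_components(page))  # [(0, 0, 1, 1), (2, 3, 3, 3)]
```

Bounding boxes of this kind are one plausible input to a later physical-layout stage, which would merge nearby components into text, half-tone, and graphics blocks.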
Copyright information
© 1990 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Govindaraju, V. et al. (1990). Newspaper image understanding. In: Ramani, S., Chandrasekar, R., Anjaneyulu, K.S.R. (eds) Knowledge Based Computer Systems. KBCS 1989. Lecture Notes in Computer Science, vol 444. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0018395
DOI: https://doi.org/10.1007/BFb0018395
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-52850-0
Online ISBN: 978-3-540-47168-4
eBook Packages: Springer Book Archive