Building cameras for capturing documents | International Journal on Document Analysis and Recognition (IJDAR)

Abstract.

This paper explores those aspects of document capture that are specific to cameras. Each of them must be addressed in order to close the gap between taking a photograph of a document and capturing the document itself. We present results in five areas: (1) framing documents using structured light, (2) robustly dealing with ambient illumination when capturing glossy documents, (3) improving text quality when using mosaiced color sensors, (4) robustly and passively recovering perspective and image plane skew using text flow, and (5) measuring and undoing page curl using structured light and an applicable surface model. The ultimate success of subsequent document recognition will be heavily dependent on the successful completion of these tasks.

Author information

Corresponding author

Correspondence to Stephen Pollard.

Additional information

Received: 8 December 2003, Revised: 6 April 2004, Published online: 11 March 2005

About this article

Cite this article

Pollard, S., Pilu, M. Building cameras for capturing documents. IJDAR 7, 123–137 (2005). https://doi.org/10.1007/s10032-004-0129-0

  • DOI: https://doi.org/10.1007/s10032-004-0129-0
