Abstract
Three-dimensional (3-D) models of outdoor scenes are widely used for object recognition, navigation, mixed reality, and other applications. Because such models are often built manually at high cost, automatic 3-D reconstruction has been widely investigated. In related work, dense 3-D models are generated by stereo methods. However, such approaches cannot use several hundred images together for dense depth estimation because it is difficult to accurately calibrate a large number of cameras. In this paper, we propose a dense 3-D reconstruction method that first estimates the extrinsic camera parameters of a hand-held video camera and then reconstructs a dense 3-D model of the scene. In the first process, the extrinsic camera parameters are estimated by automatically tracking a small number of predefined markers with known 3-D positions together with natural features. Then, several hundred dense depth maps obtained by multi-baseline stereo are combined in a voxel space. As a result, a dense 3-D model of an outdoor scene can be acquired accurately from several hundred input images captured by a hand-held video camera.
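The multi-baseline stereo step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a heavily simplified setting: 1-D rectified scanlines, cameras displaced along a single axis with integer baselines, integer disparities, and circular image shifts. The function name `sssd_inverse_depth` and all parameters are hypothetical. The idea shown is the general one behind multiple-baseline stereo: a candidate inverse depth predicts a different disparity for each baseline, and summing the matching errors over all image pairs gives a cost with a sharper minimum than any single pair gives.

```python
import numpy as np


def sssd_inverse_depth(ref, others, baselines, candidates, window=5):
    """Per-pixel inverse-depth index minimizing the sum of SSDs over all pairs.

    Assumption (illustrative only): a point with inverse-depth index k appears
    shifted by b * k pixels in the image taken at integer baseline b.
    """
    kernel = np.ones(window)
    cost = np.zeros((len(candidates), ref.size))
    for c, k in enumerate(candidates):
        for img, b in zip(others, baselines):
            # Undo the shift predicted by candidate k for this baseline.
            aligned = np.roll(img, -b * k)
            sq_diff = (ref - aligned) ** 2
            # Windowed SSD for this pair, accumulated over all pairs.
            cost[c] += np.convolve(sq_diff, kernel, mode="same")
    # For every pixel, pick the candidate with the smallest summed cost.
    return np.asarray(candidates)[np.argmin(cost, axis=0)]


# Synthetic scanline: random texture observed at three baselines,
# with the whole scene at a single inverse-depth index.
rng = np.random.default_rng(0)
ref = rng.random(256)
baselines = [1, 2, 3]
true_k = 4
others = [np.roll(ref, b * true_k) for b in baselines]

estimate = sssd_inverse_depth(ref, others, baselines, range(11))
```

In this toy case every pixel of `estimate` equals `true_k`, because the summed cost is exactly zero only at the true candidate. With a hand-held camera, the per-image shift would instead come from the estimated extrinsic camera parameters rather than from a fixed baseline along one axis.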
Cite this article
Sato, T., Kanbara, M., Yokoya, N. et al. Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-Baseline Stereo Using a Hand-Held Video Camera. International Journal of Computer Vision 47, 119–129 (2002). https://doi.org/10.1023/A:1014537706773
DOI: https://doi.org/10.1023/A:1014537706773