Abstract
We address large-scale place-of-interest recognition in cell phone images of urban scenes. We go beyond earlier approaches by exploiting 3D building information, which is now often available (e.g. from extruded floor plans), together with massive street-view-like image data for database creation. Exploiting vanishing points in query images fully removes 3D rotation from the recognition problem, which reduces the required feature invariance to a purely homothetic problem; we show that this leaves more discriminative power in the feature descriptors than classical SIFT. We rerank visual-word-based document queries using a fast stratified homothetic verification that is tailored to repetitive patterns such as window grids on facades and that, in most cases, boosts the correct document to the top positions if it was in the short list. Because we exploit 3D building information, the approach finally outputs the camera pose in real-world coordinates, ready for augmenting the cell phone image with virtual 3D information. The whole system is demonstrated to outperform traditional approaches in city-scale experiments on different sources of street-view-like image data and a challenging set of cell phone images.
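To make the homothetic model concrete: once rotation is removed, a query facade and a database facade are related by a uniform scale s and a translation t, so matched points satisfy q ≈ s·p + t. The following is a minimal sketch of a generic hypothesize-and-verify check under that model. It is not the stratified verification of the paper, whose details the abstract does not give; the function names, the exhaustive pair sampling, and the pixel tolerance `tol` are assumptions made for illustration.

```python
from itertools import combinations


def fit_homothety(pairs):
    """Least-squares fit of q ~ s * p + t over (p, q) point pairs.

    Returns (s, (tx, ty)), or None if the fit is degenerate or s <= 0.
    """
    n = len(pairs)
    pmx = sum(p[0] for p, _ in pairs) / n
    pmy = sum(p[1] for p, _ in pairs) / n
    qmx = sum(q[0] for _, q in pairs) / n
    qmy = sum(q[1] for _, q in pairs) / n
    num = sum((p[0] - pmx) * (q[0] - qmx) + (p[1] - pmy) * (q[1] - qmy)
              for p, q in pairs)
    den = sum((p[0] - pmx) ** 2 + (p[1] - pmy) ** 2 for p, _ in pairs)
    if den == 0.0:
        return None  # all source points coincide
    s = num / den
    if s <= 0.0:
        return None  # a homothety without rotation needs a positive scale
    return s, (qmx - s * pmx, qmy - s * pmy)


def inliers(model, pairs, tol):
    """Pairs whose mapped source point lies within tol of its target."""
    s, (tx, ty) = model
    return [(p, q) for p, q in pairs
            if (s * p[0] + tx - q[0]) ** 2 + (s * p[1] + ty - q[1]) ** 2
            <= tol * tol]


def verify_homothety(pairs, tol=2.0):
    """Try every pair of matches as a hypothesis and keep the best support.

    Exhaustive sampling is an assumption that only suits short match lists.
    """
    best = []
    for a, b in combinations(pairs, 2):
        model = fit_homothety([a, b])
        if model is None:
            continue
        support = inliers(model, pairs, tol)
        if len(support) > len(best):
            best = support
    if len(best) < 2:
        return None, []
    return fit_homothety(best), best  # refit on all inliers
```

The size of the returned inlier set could serve as a reranking score for a short-listed database image. With only three degrees of freedom, far fewer hypotheses are needed than for a full homography, which is part of what makes a homothetic check cheap.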
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Baatz, G., Köser, K., Chen, D., Grzeszczuk, R., Pollefeys, M. (2010). Handling Urban Location Recognition as a 2D Homothetic Problem. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15567-3_20
DOI: https://doi.org/10.1007/978-3-642-15567-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15566-6
Online ISBN: 978-3-642-15567-3
eBook Packages: Computer Science, Computer Science (R0)