Abstract
This paper introduces a multi-level classification framework for the semantic annotation of urban maps as provided by a mobile robot. Environmental cues are considered for classification at different scales. The first stage considers local scene properties using a probabilistic bag-of-words classifier. The second stage incorporates contextual information across a given scene (spatial context) and across several consecutive scenes (temporal context) via a Markov Random Field (MRF). Our approach is driven by data from an onboard camera and 3D laser scanner and uses a combination of visual and geometric features. By framing the classification exercise probabilistically, we take advantage of an information-theoretic bail-out policy when evaluating class-conditional likelihoods. This efficiency, combined with the low-order MRFs resulting from our two-stage approach, allows us to generate scene labels at speeds suitable for online deployment. We demonstrate the value of considering such spatial and temporal context during the classification task and analyze the performance of our technique on data gathered over almost 17 km of track through a city.
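To illustrate the idea of bailing out early when evaluating class-conditional likelihoods, the following is a minimal sketch and not the method of the paper. It assumes a naive-Bayes bag-of-words model and replaces the information-theoretic policy with a simple worst-case bound: a class is dropped once it can no longer catch up with the current leader, even if every remaining word favoured it maximally. The function name, the class labels, and the toy likelihood table are all hypothetical.

```python
import math


def classify_with_bailout(observations, log_lik, classes):
    """Return (best_class, number_of_likelihood_evaluations).

    observations: list of word indices observed in the scene.
    log_lik[c][w]: log p(word w | class c), an assumed naive-Bayes table.
    """
    vocab_size = len(next(iter(log_lik.values())))
    # Largest swing a single word can cause between any two classes.
    # This bounds how much a trailing class can gain per remaining word.
    max_gain = max(
        max(log_lik[c][w] for c in classes) - min(log_lik[c][w] for c in classes)
        for w in range(vocab_size)
    )

    scores = {c: 0.0 for c in classes}
    active = set(classes)
    evaluations = 0

    for i, w in enumerate(observations):
        for c in active:
            scores[c] += log_lik[c][w]
            evaluations += 1
        remaining = len(observations) - (i + 1)
        best = max(scores[c] for c in active)
        # Bail out on classes that cannot overtake the leader any more.
        active = {c for c in active if best - scores[c] <= remaining * max_gain}

    return max(active, key=lambda c: scores[c]), evaluations


# Toy example (hypothetical numbers): three classes, three visual words.
classes = ["road", "building", "vegetation"]
log_lik = {
    "road": [math.log(0.7), math.log(0.2), math.log(0.1)],
    "building": [math.log(0.1), math.log(0.7), math.log(0.2)],
    "vegetation": [math.log(0.2), math.log(0.1), math.log(0.7)],
}
observations = [0, 0, 0, 0, 0, 0, 1, 0]

label, evaluations = classify_with_bailout(observations, log_lik, classes)
print(label, evaluations)
```

Because the bound here is deterministic, the returned label always matches an exhaustive evaluation. A probabilistic policy, as described in the abstract, would trade a small, controlled risk of error for earlier bail-out.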
Cite this article
Posner, I., Cummins, M. & Newman, P. A generative framework for fast urban labeling using spatial and temporal context. Auton Robot 26, 153–170 (2009). https://doi.org/10.1007/s10514-009-9110-6
DOI: https://doi.org/10.1007/s10514-009-9110-6