Abstract
Visual Place Recognition (VPR) is a challenging task in Visual Simultaneous Localization and Mapping (VSLAM): it aims to match pairs of images that depict the same place under different conditions. Although most methods based on Convolutional Neural Networks (CNNs) perform well, they require a large number of annotated images for supervised training, which is time-consuming and labor-intensive. To train a CNN in an unsupervised way while achieving better performance, we propose a new place recognition method in this paper. We design a VGG16-based Convolutional Autoencoder (VGG-CAE) that uses the features output by VGG16 as the labels of images. VGG-CAE thus learns a latent representation from these labels and improves robustness against appearance and viewpoint variation. At deployment, features are extracted from query and reference images and post-processed, the cosine similarities between the features are computed, and a feature-matching matrix is formed accordingly. To verify the performance of our method, we conducted experiments on several public datasets, showing that our method achieves competitive results compared to existing approaches.
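The matching stage described in the abstract can be sketched as follows. This is a minimal illustration, assuming descriptors have already been extracted and post-processed; the function name, array shapes, and toy data are hypothetical and not taken from the paper:

```python
import numpy as np

def cosine_similarity_matrix(query_feats, ref_feats):
    """Pairwise cosine similarities between query and reference features.

    query_feats: (Nq, D) array of query descriptors.
    ref_feats:   (Nr, D) array of reference descriptors.
    Returns an (Nq, Nr) matrix S where S[i, j] is the cosine
    similarity between query i and reference j.
    """
    # L2-normalize each descriptor, then a dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    return q @ r.T

# Toy example: 3 query and 4 reference descriptors of dimension 8.
rng = np.random.default_rng(0)
queries = rng.normal(size=(3, 8))
refs = rng.normal(size=(4, 8))

S = cosine_similarity_matrix(queries, refs)
matches = S.argmax(axis=1)  # best-matching reference index for each query
print(S.shape, matches)
```

In practice each query would be matched to the reference with the highest similarity (optionally subject to a threshold to reject queries with no true match in the reference set).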
Acknowledgments
This work was supported by the National Natural Science Foundation of China (nos. U1913202, U1813205, U1713213, 61772508), the CAS Key Technology Talent Program, and Shenzhen Technology Projects (nos. JCYJ20180507182610734, JSGG20191129094012321).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Xu, Z., Zhang, Q., Hao, F., Ren, Z., Kang, Y., Cheng, J. (2021). VGG-CAE: Unsupervised Visual Place Recognition Using VGG16-Based Convolutional Autoencoder. In: Ma, H., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol. 13020. Springer, Cham. https://doi.org/10.1007/978-3-030-88007-1_8
DOI: https://doi.org/10.1007/978-3-030-88007-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88006-4
Online ISBN: 978-3-030-88007-1