Abstract
Visual Place Recognition (VPR) is a challenging task in Visual Simultaneous Localization and Mapping (VSLAM): it aims to match pairs of images that depict the same place under different conditions. Although most methods based on Convolutional Neural Networks (CNNs) perform well, they require a large number of annotated images for supervised training, which is time-consuming and labor-intensive. To train a CNN in an unsupervised way while achieving better performance, we propose a new place recognition method in this paper. We design a VGG16-based Convolutional Autoencoder (VGG-CAE) that uses the features output by VGG16 as the labels of images. VGG-CAE thus learns a latent representation from these labels and improves robustness against appearance and viewpoint variation. At deployment, features are extracted from query and reference images and post-processed, the cosine similarities between the features are computed, and a feature-matching matrix is formed accordingly. To verify the performance of our method, we conducted experiments on several public datasets, showing that our method achieves competitive results compared to existing approaches.
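The matching stage described in the abstract can be sketched as follows. This is a minimal illustration, assuming descriptors have already been extracted and post-processed; the function name, array shapes, and toy data are hypothetical and not taken from the paper:

```python
import numpy as np

def cosine_similarity_matrix(query_feats, ref_feats):
    """Pairwise cosine similarities between query and reference features.

    query_feats: (Nq, D) array of query descriptors.
    ref_feats:   (Nr, D) array of reference descriptors.
    Returns an (Nq, Nr) matrix S where S[i, j] is the cosine
    similarity between query i and reference j.
    """
    # L2-normalize each descriptor, then a dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    return q @ r.T

# Toy example: 3 query and 4 reference descriptors of dimension 8.
rng = np.random.default_rng(0)
queries = rng.normal(size=(3, 8))
refs = rng.normal(size=(4, 8))

S = cosine_similarity_matrix(queries, refs)
matches = S.argmax(axis=1)  # best-matching reference index for each query
print(S.shape, matches)
```

In practice each query would be matched to the reference with the highest similarity (optionally subject to a threshold to reject queries with no true match in the reference set).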
Acknowledgments
This work was supported by the National Natural Science Foundation of China (nos. U1913202, U1813205, U1713213, 61772508), the CAS Key Technology Talent Program, and Shenzhen Technology Projects (nos. JCYJ20180507182610734, JSGG20191129094012321).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Xu, Z., Zhang, Q., Hao, F., Ren, Z., Kang, Y., Cheng, J. (2021). VGG-CAE: Unsupervised Visual Place Recognition Using VGG16-Based Convolutional Autoencoder. In: Ma, H., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol. 13020. Springer, Cham. https://doi.org/10.1007/978-3-030-88007-1_8
DOI: https://doi.org/10.1007/978-3-030-88007-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88006-4
Online ISBN: 978-3-030-88007-1