Abstract
As an unsupervised learning method, the autoencoder (AE) plays an important role in model pre-training. However, current AE pre-training methods still struggle to reconstruct images faithfully and to mine deeper features. In this paper, we propose a new AE, the overall improved autoencoder (OIAE). Its contribution is twofold: a Wasserstein Generative Adversarial Network (WGAN) is used to study the relationship between an AE's reconstruction ability and its pre-training performance, and a regularization method is proposed that enables the autoencoder to learn discriminative features. We design ablation experiments to verify the effectiveness of each improvement and of OIAE as a whole, comparing against baselines. The classification accuracy of a classification network pre-trained with OIAE improves by 0.74% on the basic dataset and by 16.44% on the more difficult dataset. These promising results demonstrate the effectiveness of our method for AE pre-training tasks.
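The abstract describes combining a pixel-level reconstruction loss, a WGAN-style adversarial term on the reconstructions, and a regularizer that encourages discriminative latent codes. The paper's exact formulation is not given here, so the following is only a minimal sketch under stated assumptions: the adversarial term follows the standard WGAN generator objective, and the discriminative regularizer is a hypothetical center-loss-style penalty (the weights `lam_adv` and `lam_reg` are illustrative, not the authors' values).

```python
import numpy as np

def wgan_critic_loss(critic_real, critic_fake):
    # WGAN critic maximizes E[D(x)] - E[D(x_hat)];
    # equivalently, it minimizes this negation.
    return critic_fake.mean() - critic_real.mean()

def discriminative_reg(codes, labels):
    # Hypothetical center-loss-style regularizer: pull each latent
    # code toward the mean code of its class, so codes of the same
    # class cluster together (one way to make features discriminative).
    loss = 0.0
    for c in np.unique(labels):
        class_codes = codes[labels == c]
        center = class_codes.mean(axis=0)
        loss += ((class_codes - center) ** 2).sum()
    return loss / len(codes)

def total_loss(x, x_hat, critic_fake, codes, labels,
               lam_adv=0.1, lam_reg=0.01):
    # AE objective: reconstruction + adversarial + discriminative terms.
    rec = ((x - x_hat) ** 2).mean()   # pixel-wise reconstruction error
    adv = -critic_fake.mean()         # generator (AE) side of the WGAN game
    reg = discriminative_reg(codes, labels)
    return rec + lam_adv * adv + lam_reg * reg
```

In an actual training loop the critic and the autoencoder would be updated alternately, as in standard WGAN training; this sketch only shows how the three terms compose.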

Funding
This work was supported in part by the National Natural Science Foundation of China (No. 61876002, No. 62076005), the Anhui Natural Science Foundation Anhui Energy Internet Joint Fund (No. 2008085UD07), the Anhui Provincial University Collaborative Innovation Project (No. GXXT-2021-030), the Anhui Provincial Key Research and Development Project (No. 202104a07020029), and the Shenzhen Basic Research Program (JCYJ20170817155854115).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
Informed consent was not required as no human participants or animals were involved.
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhao, H., Wu, H. & Wang, X. OIAE: Overall Improved Autoencoder with Powerful Image Reconstruction and Discriminative Feature Extraction. Cogn Comput 15, 1334–1341 (2023). https://doi.org/10.1007/s12559-022-10000-y