Abstract
Because remote sensing data sets contain few categories and relatively little labeled data, it is challenging to apply deep convolutional feature fusion directly to remote sensing images. To address this issue, we propose a pyramid multi-subset feature fusion method that fuses the deep features extracted from different pre-trained convolutional neural networks and integrates their global and local information, yielding more discriminative, low-dimensional features. We design a weighted discriminant correlation analysis method that weights the differences between categories, so that the fusion pays more attention to categories that are hard to distinguish. To mine global and local feature information, a pyramid scheme divides feature fusion into several layers. Each layer divides the features into several subsets and fuses the corresponding subsets, and the number of subsets increases gradually from top to bottom. Feature fusion at the top of the pyramid yields a global representation, while feature fusion at the bottom yields a representation of local detail. Experimental results on three public remote sensing image data sets demonstrate that the proposed multi-deep features fusion method improves on other state-of-the-art deep learning methods.
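The pyramid layering described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the subset schedule `(1, 2, 4)`, and the contiguous split of the feature dimension are all assumptions made for illustration. In particular, `fuse_pair` uses plain concatenation as a stand-in where the paper applies its weighted discriminant correlation analysis.

```python
import numpy as np


def split_into_subsets(features, n_subsets):
    """Split the feature dimension (axis 1) into n_subsets contiguous blocks."""
    return np.array_split(features, n_subsets, axis=1)


def fuse_pair(a, b):
    """Placeholder fusion by concatenation.

    Assumption: the paper fuses each subset pair with weighted
    discriminant correlation analysis, which is not reproduced here.
    """
    return np.concatenate([a, b], axis=1)


def pyramid_fusion(x, y, subsets_per_layer=(1, 2, 4)):
    """Fuse two feature matrices layer by layer.

    x, y: arrays of shape (n_samples, dim_x) and (n_samples, dim_y),
    e.g. deep features from two different pre-trained networks.
    subsets_per_layer: hypothetical schedule; the count grows from the
    top layer (global view) to the bottom layer (local detail).
    """
    fused = []
    for n_subsets in subsets_per_layer:
        x_parts = split_into_subsets(x, n_subsets)
        y_parts = split_into_subsets(y, n_subsets)
        # Fuse only corresponding subsets within the same layer.
        for x_part, y_part in zip(x_parts, y_parts):
            fused.append(fuse_pair(x_part, y_part))
    # Stack the per-layer results into one representation per sample.
    return np.concatenate(fused, axis=1)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))   # toy features from a first network
y = rng.normal(size=(5, 12))  # toy features from a second network
z = pyramid_fusion(x, y)
```

With the concatenation placeholder, every layer contributes `dim_x + dim_y` columns, so the output grows with the number of layers. A projection-based fusion such as the one in the paper would reduce the dimensionality instead.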
Acknowledgements
This work is supported in part by the Hong Kong Research Grants Council (Project C1007-15G) and the Hong Kong Institute for Data Science.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Yuan, B., Han, L., Gu, X. et al. Multi-deep features fusion for high-resolution remote sensing image scene classification. Neural Comput & Applic 33, 2047–2063 (2021). https://doi.org/10.1007/s00521-020-05071-7
DOI: https://doi.org/10.1007/s00521-020-05071-7