Scene Recognition Method Based on Multi-level Feature Fusion and Attention Module

Computer Science, 2022, Vol. 49, Issue (4): 209-214. doi: 10.11896/jsjkx.210100135

• Computer Graphics & Multimedia •

XU Hua-jie1,2, QIN Yuan-zhuo1, YANG Yang1   

  1 College of Computer and Electronic Information, Guangxi University, Nanning 530004, China
  2 Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China
  • Received: 2021-01-18 Revised: 2021-05-20 Published: 2022-04-01
  • About author: XU Hua-jie, born in 1974, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include artificial intelligence, acoustic signal recognition and computer vision. YANG Yang, born in 1995, postgraduate. Her main research interests include artificial intelligence and computer vision.
  • Supported by:
    This work was supported by the Science and Technology Plan Project of Guangxi Zhuang Autonomous Region (2017AB15008) and the Science and Technology Plan Project of Chongzuo (FB2018001).

Abstract: A scene image usually consists of background information and foreground objects. A convolutional neural network (CNN) used for scene recognition usually has to identify the scene category from the features of key objects in the scene, sometimes in combination with the positional relationships between objects. Features of small key targets in a scene image gradually vanish as the network gets deeper, which leads to scene recognition errors. To address this problem, a scene recognition method based on multi-level feature fusion and an attention module is proposed. First, the feature extraction part of the deep neural network ResNet-18 is divided into five branches; the multi-level features output by the five branches are fused, and the fused features are used for scene classification to compensate for the lost target information. Second, an improved attention module is added to the network so that learning focuses on the key targets in the scene image, which further improves recognition. Experiments on several scene datasets show that the proposed method reaches recognition accuracies of 88.2%, 79.9% and 97.7% on the MIT-67, SUN-397 and UIUC-Sports datasets respectively, higher than current mainstream scene recognition methods.

Key words: Attention module, Convolutional neural network, Feature fusion, Scene recognition

CLC Number: TP391
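
The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the five branches correspond to the stem and the four residual stages of a standard ResNet-18 (channel widths 64, 64, 128, 256, 512), that each level is reduced by global average pooling before concatenation, and it substitutes a plain squeeze-and-excitation style channel gate for the paper's improved attention module, whose details the abstract does not give. All function names, the random feature maps and the reduction ratio are hypothetical.

```python
import numpy as np

# Assumed channel widths of five ResNet-18 levels (stem + four residual stages);
# the abstract does not state where the five branches are taken.
STAGE_CHANNELS = (64, 64, 128, 256, 512)
# Assumed spatial sizes of those levels for a 224x224 input.
STAGE_SIZES = (112, 56, 28, 14, 7)


def global_average_pool(feature_map):
    """Collapse a (C, H, W) feature map to a length-C descriptor."""
    return feature_map.mean(axis=(1, 2))


def fuse_levels(feature_maps):
    """Pool every level and concatenate the results into one fused vector."""
    return np.concatenate([global_average_pool(f) for f in feature_maps])


def channel_attention(descriptor, w1, w2):
    """SE-style gate (a stand-in, not the paper's improved module)."""
    hidden = np.maximum(w1 @ descriptor, 0.0)    # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, values in (0, 1)
    return descriptor * gate


rng = np.random.default_rng(0)
# Random arrays stand in for real CNN activations.
feature_maps = [
    rng.standard_normal((c, s, s)) for c, s in zip(STAGE_CHANNELS, STAGE_SIZES)
]
fused = fuse_levels(feature_maps)

reduction = 16  # hypothetical bottleneck ratio
dim = fused.shape[0]
w1 = rng.standard_normal((dim // reduction, dim)) * 0.01
w2 = rng.standard_normal((dim, dim // reduction)) * 0.01
attended = channel_attention(fused, w1, w2)

print(fused.shape, attended.shape)  # (1024,) (1024,)
```

Under these assumptions the fused descriptor has 64 + 64 + 128 + 256 + 512 = 1024 dimensions, so shallow levels, where small objects are still spatially resolved, contribute to the vector a classifier would receive alongside the deepest level.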