Abstract
With growing demand from real-world applications such as robot navigation and autonomous driving, achieving a good trade-off between segmentation accuracy, inference speed, and model size has become a core issue in real-time semantic segmentation. In this paper, we propose a lightweight attention-guided asymmetric network (LAANet), which adopts an asymmetric encoder–decoder architecture. In the encoder, we propose an efficient asymmetric bottleneck (EAB) module to jointly extract local and context information. In the decoder, we propose an attention-guided dilated pyramid pooling (ADPP) module, which aggregates multi-scale context information, and an attention-guided feature fusion upsampling (AFFU) module, which fuses features from different layers. LAANet has only 0.67M parameters, yet achieves 73.6% and 67.9% mean Intersection over Union (mIoU) at 95.8 and 112.5 frames per second (FPS) on the Cityscapes and CamVid datasets, respectively. The experimental results show that LAANet achieves an optimal trade-off between segmentation accuracy, inference speed, and model size.
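For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how mean Intersection over Union is commonly computed from a confusion matrix. It is an illustration only, not the evaluation code used for LAANet: the function names `confusion_matrix` and `mean_iou`, the toy label lists, and the choice to skip classes that appear in neither map are all assumptions made for this example. Benchmark evaluation scripts typically also handle ignored labels, which this sketch omits.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN).

    Classes absent from both the ground truth and the prediction are
    skipped (an assumption of this sketch).
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class-c pixels that were missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # pixels wrongly labelled c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy data: six pixels, three classes (flattened label maps).
ground_truth = [0, 0, 1, 1, 2, 2]
prediction = [0, 1, 1, 1, 2, 0]
toy_miou = mean_iou(confusion_matrix(ground_truth, prediction, 3))
print(f"mIoU = {toy_miou:.3f}")
```

In the toy data the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. A reported figure such as 73.6% is the same average taken over all evaluated classes of the test set.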










Acknowledgements
This work was supported by the Hebei Provincial Department of Education through the 2021 provincial postgraduate demonstration course construction project under Grant KCJSX2021024.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Zhang, X., Du, B., Wu, Z. et al. LAANet: lightweight attention-guided asymmetric network for real-time semantic segmentation. Neural Comput & Applic 34, 3573–3587 (2022). https://doi.org/10.1007/s00521-022-06932-z
DOI: https://doi.org/10.1007/s00521-022-06932-z