LVNet: A lightweight volumetric convolutional neural network for real-time and high-performance recognition of 3D objects

Li, Lianwei; Qin, Shiyin; Yang, Ning; Hong, Li; Dai, Yang; Wang, Zhiqiang

doi:10.1007/s11042-023-17816-2

Published: 04 January 2024

Volume 83, pages 61047–61063 (2024)

Multimedia Tools and Applications

Abstract

3D object recognition has become a hot topic in computer vision as application scenarios for 3D data multiply, including robotic systems, autonomous driving, and security check systems using active millimeter waves. Although 3D convolutional neural networks (CNNs) have achieved good results in 3D object recognition, their computational efficiency and real-time performance still need improvement because 3D convolutions carry a very large number of parameters. In this paper, we present LVNet, a lightweight volumetric CNN designed for real-time, high-performance recognition of 3D objects. All standard 3D convolutions in LVNet are replaced with depthwise separable convolutions to reduce model size and computational complexity. Furthermore, an attention mechanism is combined with the depthwise separable convolutions to compensate for the performance loss caused by the reduced parameter count. To further improve the performance of LVNet, auxiliary methods are also employed, such as data augmentation with multiple rotations of objects and information fusion across different orientations. Experimental results on public datasets show that the proposed LVNet achieves competitive recognition performance with a lower computation and memory burden.
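To make the parameter saving from depthwise separable convolutions concrete, the following minimal sketch counts the weights of a standard 3D convolution and of its depthwise separable counterpart (a per-channel k x k x k depthwise convolution followed by a 1 x 1 x 1 pointwise convolution, as in the MobileNets formulation). The function names and the layer sizes are hypothetical and chosen only for illustration; they are not taken from LVNet, and biases are ignored.

```python
def standard_conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k x k x k kernel (bias ignored)."""
    return c_in * c_out * k ** 3


def depthwise_separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise k x k x k conv followed by a 1 x 1 x 1 pointwise conv."""
    depthwise = c_in * k ** 3   # one k^3 filter per input channel
    pointwise = c_in * c_out    # 1 x 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only; not LVNet's actual configuration.
c_in, c_out, k = 32, 64, 3
standard = standard_conv3d_params(c_in, c_out, k)
separable = depthwise_separable_conv3d_params(c_in, c_out, k)

print("standard:", standard)
print("separable:", separable)
print("ratio:", round(separable / standard, 3))
```

Under these assumptions the ratio equals 1/c_out + 1/k^3, so a 3 x 3 x 3 kernel can save at most a factor of roughly 27 regardless of how many output channels the layer has.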

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

Information Science Academy of China Electronics Technology Group Corporation, Beijing 100086, China
Lianwei Li, Ning Yang, Li Hong, Yang Dai & Zhiqiang Wang
School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
Shiyin Qin
School of Electrical Engineering and Intelligentization, Dongguan University of Technology, Dongguan 523808, China
Shiyin Qin

Corresponding author

Correspondence to Lianwei Li.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, L., Qin, S., Yang, N. et al. LVNet: A lightweight volumetric convolutional neural network for real-time and high-performance recognition of 3D objects. Multimed Tools Appl 83, 61047–61063 (2024). https://doi.org/10.1007/s11042-023-17816-2

Received: 14 September 2023
Revised: 31 October 2023
Accepted: 30 November 2023
Published: 04 January 2024
Issue Date: June 2024
DOI: https://doi.org/10.1007/s11042-023-17816-2