Abstract
Underwater target detection and classification based on sonar images is a challenging task because of the complex underwater environment. Traditional sonar image target detection methods suffer from low accuracy and poor robustness, and in recent years deep learning has substantially improved the detection accuracy of underwater targets. However, deep learning algorithms for sonar image target detection are limited by scarce training samples and low detection speed. To solve these problems, this paper proposes an improved YOLOv4-based sonar target detection and classification algorithm. First, the feature extraction network CSPDarknet-53 in YOLOv4 is modified to reduce both the number of model parameters and the network depth. Second, the PANet feature enhancement module in the YOLOv4 model is replaced by the adaptive spatial feature fusion (ASFF) module to obtain a better feature fusion effect. In addition, the number of fused feature layers is increased to enlarge the receptive field and improve detection accuracy. Furthermore, the k-means++ algorithm is used to cluster the bounding boxes of the sonar image dataset to obtain an appropriate size and number of anchor boxes for model training. The experimental results show that the proposed method achieves better detection accuracy and detection speed than YOLOv4 and YOLOv4-tiny.
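The anchor-box clustering step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure, so the sketch assumes the 1 − IoU distance commonly used for YOLO anchor clustering, and the names `iou_wh` and `kmeanspp_anchors` as well as the sample box sizes are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeanspp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: distance = 1 - IoU (not specified in the abstract).
    """
    rng = random.Random(seed)

    # k-means++ seeding: the first centre is drawn uniformly, each later
    # centre with probability proportional to its squared distance to the
    # nearest centre chosen so far.
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centroids) ** 2 for b in boxes]
        if sum(d2) == 0:
            # Every box coincides with an existing centre; fall back to
            # a uniform draw.
            centroids.append(rng.choice(boxes))
        else:
            centroids.append(rng.choices(boxes, weights=d2, k=1)[0])

    # Standard Lloyd iterations: assign each box to the centre with the
    # highest IoU, then move each centre to the mean of its members.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centre
        if updated == centroids:
            break
        centroids = updated

    # Return anchors ordered from smallest to largest area.
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Hypothetical box sizes in pixels, for illustration only.
    sample = [(10, 10)] * 5 + [(50, 20)] * 5 + [(100, 100)] * 5
    print(kmeanspp_anchors(sample, k=3))
```

In practice the input would be the widths and heights of the labelled boxes in the training set, and the number of anchors `k` could be chosen by comparing the mean IoU between boxes and their nearest anchor across several values of `k`.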
Data Availability
The data used to support the findings of this study are included within the article.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61801169) and the Applied Basic Research Programs of Changzhou (CJ20200061).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflicts of interest regarding this work.
Cite this article
Fan, X., Lu, L., Shi, P. et al. A novel sonar target detection and classification algorithm. Multimed Tools Appl 81, 10091–10106 (2022). https://doi.org/10.1007/s11042-022-12054-4
DOI: https://doi.org/10.1007/s11042-022-12054-4