Abstract
Modern underwater object detection methods recognize objects from sonar data based on their geometric shapes. However, the distortion of objects during data acquisition and representation is seldom considered. In this paper, we present a detailed summary of representations for sonar data and a concrete analysis of the geometric characteristics of different data representations. Based on this, a feature fusion framework is proposed to fully use the intensity features extracted from the polar image representation and the geometric features learned from the point cloud representation of sonar data. Three feature fusion strategies are presented to investigate the impact of feature fusion on different components of the detection pipeline. In addition, the fusion strategies can be easily integrated into other detectors, such as the You Only Look Once (YOLO) series. The effectiveness of our proposed framework and feature fusion strategies is demonstrated on a public sonar dataset captured in real-world underwater environments. Experimental results show that our method benefits both the region proposal and the object classification modules in the detectors.
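Abstract (translated from Chinese)
Existing underwater object detection methods mostly recognize objects from sonar data based on their geometric shapes, and they largely ignore the shape distortion introduced during data acquisition and data representation. To address this, this paper presents a comparative analysis of different representations of sonar data and, on this basis, proposes a feature fusion framework that fully exploits the intensity features extracted from polar images and the geometric features learned from the point cloud representation. Three feature fusion strategies are designed within the framework to analyze the impact of feature fusion on different modules of the detector. These fusion strategies can also be integrated directly into other detectors, such as the YOLO series. A series of comparative experiments on a public sonar dataset captured in real underwater scenes verifies the effectiveness of the proposed framework and feature fusion strategies. Experimental results show that the proposed feature fusion method improves the results of both the region proposal module and the classification module of the detector.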
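To make the link between the polar image and point cloud representations concrete, the following is a minimal sketch of converting a polar sonar image into a set of Cartesian points. It is an illustration under stated assumptions, not the conversion used in the paper: the function names, the fixed intensity threshold, and the sensor parameters (`r_min`, `r_max`, `fov_deg`) are all hypothetical. The helper `arc_width` shows why shapes distort in the polar image: one beam bin covers a wider physical arc at long range than at short range.

```python
import math


def polar_image_to_points(image, r_min, r_max, fov_deg, threshold):
    """Convert a polar sonar image into a list of (x, y, intensity) points.

    image[i][j] is the echo intensity at range bin i and beam (bearing) bin j.
    All parameters are illustrative assumptions, not values from the paper.
    """
    n_ranges = len(image)
    n_beams = len(image[0])
    range_step = (r_max - r_min) / n_ranges
    beam_step = math.radians(fov_deg) / n_beams
    half_fov = math.radians(fov_deg) / 2.0

    points = []
    for i, row in enumerate(image):
        r = r_min + (i + 0.5) * range_step  # range at the bin centre
        for j, intensity in enumerate(row):
            if intensity < threshold:  # drop weak echoes (assumed fixed threshold)
                continue
            theta = -half_fov + (j + 0.5) * beam_step  # bearing at the bin centre
            # x is across-track, y points along the sonar's forward axis
            points.append((r * math.sin(theta), r * math.cos(theta), intensity))
    return points


def arc_width(r, fov_deg, n_beams):
    """Physical arc length covered by one beam bin at range r."""
    return r * math.radians(fov_deg) / n_beams


# Toy 3x3 polar image: rows are range bins, columns are beam bins.
image = [
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0],
    [0.8, 0.0, 0.7],
]
points = polar_image_to_points(image, r_min=1.0, r_max=4.0, fov_deg=90.0, threshold=0.5)
```

In this toy setup, a beam bin at a range of 4 units spans four times the arc length of the same bin at a range of 1 unit, which is the kind of range-dependent stretching that a point cloud representation avoids.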
Data availability
Due to the nature of this research, participants of this study did not agree for their data to be shared publicly, so supporting data are not available.
Author information
Contributions
Fei WANG and Jingchun ZHOU designed the research. Fei WANG, Wanyu LI, and Miao LIU processed the data. Fei WANG drafted the paper. Weishi ZHANG helped organize the paper. Fei WANG and Jingchun ZHOU revised and finalized the paper.
Ethics declarations
Fei WANG, Wanyu LI, Miao LIU, Jingchun ZHOU, and Weishi ZHANG declare that they have no conflict of interest.
Additional information
Project supported by the National Natural Science Foundation of China (No. 62103072) and the Postdoctoral Science Foundation of China (No. 2021M690502).
About this article
Cite this article
Wang, F., Li, W., Liu, M. et al. Underwater object detection by fusing features from different representations of sonar data. Front Inform Technol Electron Eng 24, 828–843 (2023). https://doi.org/10.1631/FITEE.2200429
DOI: https://doi.org/10.1631/FITEE.2200429