{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,1,7]],"date-time":"2025-01-07T05:19:24Z","timestamp":1736227164084,"version":"3.32.0"},"reference-count":56,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2023,10,31]],"date-time":"2023-10-31T00:00:00Z","timestamp":1698710400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea (NRF)","doi-asserted-by":"crossref","award":["NRF-2022R1C1C1008074"],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Institute of Information and Communications Technology Planning and Evaluation (IITP)","award":["RS-2022-00155911"]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"In this paper, we introduce Sonar image Augmentation with Cut and Paste based DataBank for semantic segmentation (SACuP), a novel data augmentation framework specifically designed for sonar imagery. Unlike traditional methods that often overlook the distinctive traits of sonar images, SACuP effectively harnesses these unique characteristics, including shadows and noise. SACuP operates on an object-unit level, differentiating it from conventional augmentation methods applied to entire images or object groups. Improving semantic segmentation performance while carefully preserving the unique properties of acoustic images is differentiated from others. Importantly, this augmentation process requires no additional manual work, as it leverages existing images and masks seamlessly. Our extensive evaluations contrasting SACuP against established augmentation methods unveil its superior performance, registering an impressive 1.10% gain in mean intersection over union (mIoU) over the baseline. 
Furthermore, our ablation study elucidates the nuanced contributions of individual and combined augmentation methods, such as cut and paste, brightness adjustment, and shadow generation, to model enhancement. We anticipate SACuP\u2019s versatility in augmenting scarce sonar data across a spectrum of tasks, particularly within the domain of semantic segmentation. Its potential extends to bolstering the effectiveness of underwater exploration by providing high-quality sonar data for training machine learning models.","DOI":"10.3390\/rs15215185","type":"journal-article","created":{"date-parts":[[2023,10,31]],"date-time":"2023-10-31T16:48:31Z","timestamp":1698770911000},"page":"5185","source":"Crossref","is-referenced-by-count":1,"title":["SACuP: Sonar Image Augmentation with Cut and Paste Based DataBank for Semantic Segmentation"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-2035-751X","authenticated-orcid":false,"given":"Sundong","family":"Park","sequence":"first","affiliation":[{"name":"Department of Software Convergence, Kyung Hee University, Yongin 17104, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8510-9279","authenticated-orcid":false,"given":"Yoonyoung","family":"Choi","sequence":"additional","affiliation":[{"name":"Department of Software Convergence, Kyung Hee University, Yongin 17104, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3241-8455","authenticated-orcid":false,"given":"Hyoseok","family":"Hwang","sequence":"additional","affiliation":[{"name":"Department of Software Convergence, Kyung Hee University, Yongin 17104, Republic of Korea"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,31]]},"reference":[{"key":"ref_1","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_2","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very Deep Convolutional Networks for Large-Scale Image Recognition. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_4","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_5","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_6","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_10","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). Proceedings of the Medical Image Computing and Computer-Assisted Intervention\u2013MICCAI 2015: 18th International Conference, Munich, Germany, October 5\u20139, 2015, Proceedings, Part III 18, Springer."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., and Schiele, B. (2016, January 27\u201330). The cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.350"},{"key":"ref_13","first-page":"3523","article-title":"Image segmentation using deep learning: A survey","volume":"44","author":"Minaee","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Valdenegro-Toro, M. (2016, January 19\u201323). Object recognition in forward-looking sonar images with convolutional neural networks. Proceedings of the OCEANS 2016 MTS\/IEEE Monterey, Monterey, CA, USA.","DOI":"10.1109\/OCEANS.2016.7761140"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Valdenegro-Toro, M. (2016, January 6\u20139). End-to-end object detection and recognition in forward-looking sonar images with convolutional neural networks. 
Proceedings of the 2016 IEEE\/OES Autonomous Underwater Vehicles (AUV), Tokyo, Japan.","DOI":"10.1109\/AUV.2016.7778662"},{"key":"ref_16","first-page":"23","article-title":"Fundamentals of acoustics","volume":"Volume 1","author":"Hansen","year":"2001","journal-title":"Occupational Exposure to Noise: Evaluation, Prevention and Control"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"105157","DOI":"10.1016\/j.engappai.2022.105157","article-title":"Survey on deep learning based computer vision for sonar imagery","volume":"114","author":"Steiniger","year":"2022","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sun, C., Shrivastava, A., Singh, S., and Gupta, A. (2017, January 22\u201329). Revisiting unreasonable effectiveness of data in deep learning era. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.97"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/j.neucom.2018.05.083","article-title":"Deep visual domain adaptation: A survey","volume":"312","author":"Wang","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_20","first-page":"1","article-title":"Generalizing from a few examples: A survey on few-shot learning","volume":"53","author":"Wang","year":"2020","journal-title":"ACM Comput. Surv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Figueira, A., and Vaz, B. (2022). Survey on synthetic data generation, evaluation methods and GANs. Mathematics, 10.","DOI":"10.3390\/math10152733"},{"key":"ref_22","unstructured":"Yang, S., Xiao, W., Zhang, M., Guo, S., Zhao, J., and Shen, F. (2022). Image data augmentation for deep learning: A survey. arXiv."},{"key":"ref_23","unstructured":"DeVries, T., and Taylor, G.W. (2017). Improved regularization of convolutional neural networks with cutout. arXiv."},{"key":"ref_24","unstructured":"Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., and Yoo, Y. 
(November, January 27). Cutmix: Regularization strategy to train strong classifiers with localizable features. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_25","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"5528814","DOI":"10.1109\/TGRS.2022.3176216","article-title":"MSLAN: A Two-Branch Multidirectional Spectral\u2013Spatial LSTM Attention Network for Hyperspectral Image Classification","volume":"60","author":"Song","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Sheng, Y., and Xiao, L. (2022, January 17\u201322). Manifold Augmentation Based Self-Supervised Contrastive Learning for Few-Shot Remote Sensing Scene Classification. Proceedings of the IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia.","DOI":"10.1109\/IGARSS46834.2022.9884445"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"109341","DOI":"10.1016\/j.apacoust.2023.109341","article-title":"An underwater small target boundary segmentation method in forward-looking sonar images","volume":"207","author":"Zhang","year":"2023","journal-title":"Appl. Acoust."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.jcp.2017.10.006","article-title":"A review of level-set methods and some recent applications","volume":"353","author":"Gibou","year":"2018","journal-title":"J. Comput. Phys."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhao, D., Ge, W., Chen, P., Hu, Y., Dang, Y., Liang, R., and Guo, X. (2022). Feature Pyramid U-Net with Attention for Semantic Segmentation of Forward-Looking Sonar Images. 
Sensors, 22.","DOI":"10.3390\/s22218468"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., and Lo, W.Y. (2023, January 2\u20133). Segment Anything. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Paris, France.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"ref_32","unstructured":"Wang, L., Ye, X., Zhu, L., Wu, W., Zhang, J., Xing, H., and Hu, C. (2023). When SAM Meets Sonar Images. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Lee, E.h., Park, B., Jeon, M.H., Jang, H., Kim, A., and Lee, S. (2022). Data augmentation using image translation for underwater sonar image segmentation. PLoS ONE, 17.","DOI":"10.1371\/journal.pone.0272602"},{"key":"ref_35","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_37","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). Efficientnet: Rethinking model scaling for convolutional neural networks. 
Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/s40537-019-0197-0","article-title":"A survey on image data augmentation for deep learning","volume":"6","author":"Shorten","year":"2019","journal-title":"J. Big Data"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhang, J., Zhang, Y., and Xu, X. (2021, January 18\u201322). Objectaug: Object-level data augmentation for semantic image segmentation. Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China.","DOI":"10.1109\/IJCNN52387.2021.9534020"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Ghiasi, G., Cui, Y., Srinivas, A., Qian, R., Lin, T.Y., Cubuk, E.D., Le, Q.V., and Zoph, B. (2021, January 19\u201325). Simple copy-paste is a strong data augmentation method for instance segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR46437.2021.00294"},{"key":"ref_41","unstructured":"Illarionova, S., Nesteruk, S., Shadrin, D., Ignatiev, V., Pukalchik, M., and Oseledets, I. (2021). Object-based augmentation improves quality of remote sensing semantic segmentation. arXiv."},{"key":"ref_42","unstructured":"Kingma, D.P., and Welling, M. (2013). Auto-encoding variational bayes. arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Gatys, L.A., Ecker, A.S., and Bethge, M. (2016, January 27\u201330). Image style transfer using convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.265"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Manh\u00e3es, M.M.M., Scherer, S.A., Voss, M., Douat, L.R., and Rauschenbach, T. (2016, January 19\u201323). UUV simulator: A gazebo-based package for underwater intervention and multi-robot simulation. 
Proceedings of the OCEANS 2016 MTS\/IEEE Monterey, Monterey, CA, USA.","DOI":"10.1109\/OCEANS.2016.7761080"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"DeMarco, K.J., West, M.E., and Howard, A.M. (2015, January 19\u201322). A computationally-efficient 2D imaging sonar model for underwater robotics simulations in Gazebo. Proceedings of the OCEANS 2015-MTS\/IEEE Washington, Washington, DC, USA.","DOI":"10.23919\/OCEANS.2015.7404349"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.cag.2017.08.008","article-title":"A novel GPU-based sonar simulator for real-time applications","volume":"68","author":"Cerqueira","year":"2017","journal-title":"Comput. Graph."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"101086","DOI":"10.1016\/j.gmod.2020.101086","article-title":"A rasterized ray-tracer pipeline for real-time, multi-device sonar simulation","volume":"111","author":"Cerqueira","year":"2020","journal-title":"Graph. Model."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"706646","DOI":"10.3389\/frobt.2021.706646","article-title":"Physics-based modelling and simulation of multibeam echosounder perception for autonomous underwater manipulation","volume":"8","author":"Choi","year":"2021","journal-title":"Front. Robot. AI"},{"key":"ref_49","unstructured":"Koenig, N., and Howard, A. (October, January 28). Design and use paradigms for gazebo, an open-source multi-robot simulator. Proceedings of the 2004 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), Sendai, Japan."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/j.ifacol.2019.12.322","article-title":"Realistic sonar image simulation using generative adversarial network","volume":"52","author":"Sung","year":"2019","journal-title":"IFAC-PapersOnLine"},{"key":"ref_51","unstructured":"Lee, S., Park, B., and Kim, A. (2018). 
Deep learning from shallow dives: Sonar image generation and training for underwater object detection. arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Singh, D., and Valdenegro-Toro, M. (2021, January 11\u201317). The marine debris dataset for forward-looking sonar semantic segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00417"},{"key":"ref_53","unstructured":"SoundMetrics (2023, August 07). ARIS Explorer 3000: See What Others Can\u2019t. Available online: http:\/\/www.soundmetrics.com\/products\/aris-sonars\/ARIS-Explorer-3000\/015335_RevD_ARIS-Explorer-3000_Brochure."},{"key":"ref_54","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015, January 7\u201313). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_56","unstructured":"Park, T., Efros, A.A., Zhang, R., and Zhu, J.Y. (2020). 
Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IX 16, Springer."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/21\/5185\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,6]],"date-time":"2025-01-06T12:52:20Z","timestamp":1736167940000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/21\/5185"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,31]]},"references-count":56,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2023,11]]}},"alternative-id":["rs15215185"],"URL":"https:\/\/doi.org\/10.3390\/rs15215185","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2023,10,31]]}}}