Simultaneous Localization and Segmentation of Fish Objects Using Multi-task CNN and Dense CRF

Labao, Alfonso B.; Naval, Prospero C.

doi:10.1007/978-3-030-14799-0_52

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11431)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

We propose a deep learning tool that localizes fish objects in benthic underwater videos on a frame-by-frame basis. The deep network predicts the spatial coordinates of each fish object and simultaneously segments its corresponding pixels. The network follows a state-of-the-art Inception-ResNet-v2 architecture that automatically generates informative features for the object localization and mask segmentation tasks. Predicted masks are passed to dense Conditional Random Field (CRF) post-processing for contour and shape refinement. Unlike prior methods that rely on motion information to segment fish objects, our proposed method requires only RGB video frames to predict both box coordinates and object pixel masks. Independence from motion information makes the model more robust to camera movement and jitter, and better suited to processing underwater videos taken from unmanned water vehicles. We test the model on actual benthic underwater video frames taken from ten different sites. The proposed tool segments fish objects despite wide camera movements and blurred underwater imagery, and it is robust to a wide variety of environments and fish species shapes.
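
The dense CRF refinement step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the kernels, weights, or inference procedure used, so everything below is an assumption: a two-label fully connected CRF with a Gaussian appearance kernel, a Gaussian smoothness kernel, and a Potts compatibility, solved by brute-force mean-field updates in NumPy. The function name `dense_crf_refine` and all parameter values are hypothetical. The explicit N x N kernel matrix is only practical for tiny images; real implementations rely on fast high-dimensional filtering instead.

```python
import numpy as np

def dense_crf_refine(prob_fg, image, n_iters=5, w_app=4.0, w_smooth=1.0,
                     theta_pos=3.0, theta_rgb=0.2, theta_smooth=1.0):
    """Refine a foreground probability map with a brute-force dense CRF.

    prob_fg: (H, W) foreground probabilities, e.g. from a mask head.
    image:   (H, W, 3) RGB values scaled to [0, 1].
    All weights and bandwidths are illustrative assumptions.
    """
    h, w = prob_fg.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    rgb = image.reshape(-1, 3).astype(float)                        # (N, 3)

    # Squared distances between every pair of pixels (fully connected graph).
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    d_rgb = ((rgb[:, None, :] - rgb[None, :, :]) ** 2).sum(-1)

    # Appearance kernel (nearby pixels with similar colour) plus smoothness kernel.
    kernel = (w_app * np.exp(-d_pos / (2 * theta_pos ** 2)
                             - d_rgb / (2 * theta_rgb ** 2))
              + w_smooth * np.exp(-d_pos / (2 * theta_smooth ** 2)))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    p = np.clip(prob_fg.ravel(), 1e-6, 1 - 1e-6)
    q = np.stack([1 - p, p], axis=1)   # columns: background, foreground
    unary = -np.log(q)                 # unary energies from the network output

    for _ in range(n_iters):
        msg = kernel @ q               # kernel-weighted label mass per pixel
        pairwise = msg[:, ::-1]        # Potts: each label pays for the other label's mass
        logits = -unary - pairwise
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)

    return q[:, 1].reshape(h, w)

# Synthetic 8x8 frame: dark background on the left, bright "fish" on the right.
image = np.full((8, 8, 3), 0.1)
image[:, 4:] = 0.9
noisy = np.full((8, 8), 0.2)
noisy[:, 4:] = 0.8
noisy[3, 5] = 0.4   # false negative inside the fish region
noisy[2, 1] = 0.6   # false positive in the background
refined = dense_crf_refine(noisy, image)
```

In this toy example the two deliberately mislabelled pixels are pulled back toward the label of the similarly coloured pixels around them, which is the kind of contour and shape clean-up that CRF post-processing is meant to provide.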

Author information

Authors and Affiliations

Computer Vision and Machine Intelligence Group, Department of Computer Science, College of Engineering, University of the Philippines, Diliman, Quezon City, Philippines
Alfonso B. Labao & Prospero C. Naval Jr.

Corresponding author

Correspondence to Prospero C. Naval Jr.

Editor information

Editors and Affiliations

Ton Duc Thang University, Ho Chi Minh City, Vietnam
Ngoc Thanh Nguyen
Bina Nusantara University, Jakarta, Indonesia
Ford Lumban Gaol
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Wrocław University of Science and Technology, Wrocław, Poland
Bogdan Trawiński

About this paper

Cite this paper

Labao, A.B., Naval, P.C. (2019). Simultaneous Localization and Segmentation of Fish Objects Using Multi-task CNN and Dense CRF. In: Nguyen, N., Gaol, F., Hong, TP., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2019. Lecture Notes in Computer Science, vol 11431. Springer, Cham. https://doi.org/10.1007/978-3-030-14799-0_52

DOI: https://doi.org/10.1007/978-3-030-14799-0_52
Published: 07 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14798-3
Online ISBN: 978-3-030-14799-0
eBook Packages: Computer Science, Computer Science (R0)
