A Fully Convolutional Network with Waterfall Atrous Spatial Pooling and Localized Active Contour Loss for Fish Segmentation
DOI: https://doi.org/10.4108/eetinis.v10i1.2942

Keywords: Fully convolutional network, DeepFish, SIUM fish data, Waterfall Atrous Spatial Pooling

Abstract
Accurate measurement and statistics of fish data are important for the sustainable development of the aquatic environment and marine fisheries. Automatic fish segmentation is one of the key tasks underpinning such measurement and statistics. Fish segmentation, however, is challenging because of artifacts in underwater images. In this study, we introduce a deep-learning approach, FCN-WRN-WASP, for automatic fish segmentation from underwater images. In particular, we introduce a computationally efficient variant called the Waterfall Atrous Spatial Pooling (WASP) module into a fully convolutional network with a Wide ResNet baseline. We also propose a loss function, inspired by the active contour approach, that exploits local intensity information from the input image. The approach has been validated on the DeepFish and SIUM data sets. The results are promising for fish segmentation, with higher Intersection over Union (IoU) scores than the state of the art. The evaluation shows that incorporating the image-based active contour loss improves segmentation performance, and that the WASP module in the architecture is effective, especially for foreground fish segmentation.
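To make the architectural idea concrete, the sketch below shows a minimal PyTorch implementation of a WASP module in the spirit of Artacho and Savakis (2019): unlike ASPP, where atrous branches run in parallel, each branch here feeds the next in a "waterfall" cascade, which reuses earlier filters and reduces computation. The channel widths and dilation rates are illustrative assumptions, not the exact configuration used in the paper.

```python
import torch
import torch.nn as nn


class WASP(nn.Module):
    """Sketch of a Waterfall Atrous Spatial Pooling module.

    Each atrous branch consumes the previous branch's output (waterfall),
    and all branch outputs plus a global-pooling branch are concatenated
    and fused by a 1x1 convolution. Sizes here are assumptions.
    """

    def __init__(self, in_ch=2048, mid_ch=256, rates=(6, 12, 18, 24)):
        super().__init__()
        self.branches = nn.ModuleList()
        ch = in_ch
        for r in rates:
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
            ))
            ch = mid_ch  # waterfall: the next branch consumes this output
        self.pool = nn.Sequential(  # global-context branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Sequential(  # fuse the concatenated features
            nn.Conv2d(mid_ch * (len(rates) + 1), mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        feats, y = [], x
        for branch in self.branches:
            y = branch(y)  # cascade: one rate's output feeds the next rate
            feats.append(y)
        g = self.pool(x)
        g = nn.functional.interpolate(
            g, size=x.shape[2:], mode="bilinear", align_corners=False)
        feats.append(g)
        return self.project(torch.cat(feats, dim=1))
```

In a full FCN-WRN-WASP model, a module like this would sit between the Wide ResNet encoder and the segmentation head, though the exact placement and widths depend on the paper's configuration.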
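The loss described in the abstract draws on localized region-based active contours (Lankton and Tannenbaum, 2008) and learned active contour losses (Chen et al., 2019). The following is a minimal sketch of such a local-intensity term, not the paper's exact formulation: the window size, weighting, and normalization are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def localized_ac_loss(pred, image, win=11, eps=1e-6):
    """Sketch of a localized active-contour loss.

    pred:  predicted foreground probability map, shape (N, 1, H, W)
    image: grayscale input image, shape (N, 1, H, W)
    The region term pulls each pixel toward the local mean intensity of
    its assigned (inside/outside) region; a total-variation-style term
    penalizes contour length. All hyperparameters are assumptions.
    """
    k = torch.ones(1, 1, win, win, device=pred.device) / (win * win)
    pad = win // 2
    # local mean intensities inside and outside the predicted region
    c_in = (F.conv2d(pred * image, k, padding=pad)
            / (F.conv2d(pred, k, padding=pad) + eps))
    c_out = (F.conv2d((1 - pred) * image, k, padding=pad)
             / (F.conv2d(1 - pred, k, padding=pad) + eps))
    # region term: fit each pixel to its local inside/outside mean
    region = (pred * (image - c_in) ** 2
              + (1 - pred) * (image - c_out) ** 2).mean()
    # length (smoothness) term on the predicted segmentation
    dh = torch.abs(pred[:, :, 1:, :] - pred[:, :, :-1, :]).mean()
    dw = torch.abs(pred[:, :, :, 1:] - pred[:, :, :, :-1]).mean()
    return region + dh + dw
```

In training, a term like this would typically be added to a standard segmentation loss such as cross-entropy, with a weighting factor that the paper itself would specify.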
License
Copyright (c) 2023 EAI Endorsed Transactions on Industrial Networks and Intelligent Systems
This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported (CC BY 3.0) license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.