A Fully Convolutional Network with Waterfall Atrous Spatial Pooling and Localized Active Contour Loss for Fish Segmentation
DOI: https://doi.org/10.4108/eetinis.v10i1.2942

Keywords: Fully convolutional network, DeepFish, SIUM fish data, Waterfall Atrous Spatial Pooling

Abstract
Accurate measurement and statistics of fish data are important for the sustainable development of the aquatic environment and marine fisheries. Automatic fish segmentation is one of the key tasks underpinning such measurement and statistics. Fish segmentation, however, is challenging because of artifacts in underwater images. In this study, we introduce a deep-learning approach, FCN-WRN-WASP, for automatic fish segmentation from underwater images. In particular, we introduce a computationally efficient variant called the Waterfall Atrous Spatial Pooling (WASP) module into a fully convolutional network with a Wide ResNet baseline. We also propose a loss function, inspired by the active contour approach, that exploits local intensity information from the input image. The approach has been validated on the DeepFish and SIUM data sets. The results are promising for fish segmentation, with higher Intersection over Union (IoU) scores than the state of the art. The evaluation shows that incorporating the image-based active contour loss improves segmentation performance, and that the WASP module in the architecture is effective, especially for foreground fish segmentation.
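To make the architectural idea concrete, the sketch below shows a minimal PyTorch implementation of a WASP module in the spirit of Artacho and Savakis (2019): unlike ASPP, where atrous branches run in parallel, each branch here feeds the next in a "waterfall" cascade, which reuses earlier filters and reduces computation. The channel widths and dilation rates are illustrative assumptions, not the exact configuration used in the paper.

```python
import torch
import torch.nn as nn


class WASP(nn.Module):
    """Sketch of a Waterfall Atrous Spatial Pooling module.

    Each atrous branch consumes the previous branch's output (waterfall),
    and all branch outputs plus a global-pooling branch are concatenated
    and fused by a 1x1 convolution. Sizes here are assumptions.
    """

    def __init__(self, in_ch=2048, mid_ch=256, rates=(6, 12, 18, 24)):
        super().__init__()
        self.branches = nn.ModuleList()
        ch = in_ch
        for r in rates:
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
            ))
            ch = mid_ch  # waterfall: the next branch consumes this output
        self.pool = nn.Sequential(  # global-context branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Sequential(  # fuse the concatenated features
            nn.Conv2d(mid_ch * (len(rates) + 1), mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        feats, y = [], x
        for branch in self.branches:
            y = branch(y)  # cascade: one rate's output feeds the next rate
            feats.append(y)
        g = self.pool(x)
        g = nn.functional.interpolate(
            g, size=x.shape[2:], mode="bilinear", align_corners=False)
        feats.append(g)
        return self.project(torch.cat(feats, dim=1))
```

In a full FCN-WRN-WASP model, a module like this would sit between the Wide ResNet encoder and the segmentation head, though the exact placement and widths depend on the paper's configuration.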
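The loss described in the abstract draws on localized region-based active contours (Lankton and Tannenbaum, 2008) and learned active contour losses (Chen et al., 2019). The following is a minimal sketch of such a local-intensity term, not the paper's exact formulation: the window size, weighting, and normalization are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def localized_ac_loss(pred, image, win=11, eps=1e-6):
    """Sketch of a localized active-contour loss.

    pred:  predicted foreground probability map, shape (N, 1, H, W)
    image: grayscale input image, shape (N, 1, H, W)
    The region term pulls each pixel toward the local mean intensity of
    its assigned (inside/outside) region; a total-variation-style term
    penalizes contour length. All hyperparameters are assumptions.
    """
    k = torch.ones(1, 1, win, win, device=pred.device) / (win * win)
    pad = win // 2
    # local mean intensities inside and outside the predicted region
    c_in = (F.conv2d(pred * image, k, padding=pad)
            / (F.conv2d(pred, k, padding=pad) + eps))
    c_out = (F.conv2d((1 - pred) * image, k, padding=pad)
             / (F.conv2d(1 - pred, k, padding=pad) + eps))
    # region term: fit each pixel to its local inside/outside mean
    region = (pred * (image - c_in) ** 2
              + (1 - pred) * (image - c_out) ** 2).mean()
    # length (smoothness) term on the predicted segmentation
    dh = torch.abs(pred[:, :, 1:, :] - pred[:, :, :-1, :]).mean()
    dw = torch.abs(pred[:, :, :, 1:] - pred[:, :, :, :-1]).mean()
    return region + dh + dw
```

In training, a term like this would typically be added to a standard segmentation loss such as cross-entropy, with a weighting factor that the paper itself would specify.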
License
Copyright (c) 2023 EAI Endorsed Transactions on Industrial Networks and Intelligent Systems
This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported (CC BY 3.0) license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.