Enhancing Single-Image Super-Resolution using Patch-Mosaic Data Augmentation on Lightweight Bimodal Network

Authors

Q. T. Nguyen, T. Quang Hieu

DOI:

https://doi.org/10.4108/eetinis.v10i2.2774

Keywords:

single-image super-resolution, data augmentation, vision transformer, CNN

Abstract

With the advancement of deep learning, single-image super-resolution (SISR) has made significant strides. However, most current SISR methods are difficult to deploy in real-world applications because their complex operations incur substantial computational and memory costs. Furthermore, an effective dataset is a key factor in improving model training. Hybrid models that combine a CNN with a Vision Transformer can be more efficient for the SISR task; nevertheless, they require large or extremely high-quality training datasets, which are not always available. To tackle these issues, this research proposes a solution that combines a Lightweight Bimodal Network (LBNet) with the Patch-Mosaic data augmentation method, an enhancement of CutMix and YOCO. With patch-oriented Mosaic data augmentation, an efficient Symmetric CNN is utilized for local feature extraction and coarse image restoration, while a Recursive Transformer captures the long-range dependencies of the image so that global information can be fully exploited to refine texture details. Extensive experiments show that LBNet trained with the proposed data augmentation method, which adds zero extra parameters, outperforms the original LBNet and other state-of-the-art techniques that use image-level data augmentation.
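
As a rough illustration only (the article page does not include code), the sketch below shows the core patch-level mosaic idea under stated assumptions: four aligned LR/HR crops are stitched into a single 2x2 LR mosaic and the corresponding HR mosaic, preserving their alignment and scale factor while adding no learnable parameters. The function name mosaic_sr_pair, the NumPy (H, W, C) layout, and the fixed 2x2 grid are illustrative assumptions, not the authors' exact Patch-Mosaic implementation.

```python
import numpy as np

def mosaic_sr_pair(lr_patches, hr_patches, scale=4, rng=None):
    """Hypothetical sketch of patch-level mosaic augmentation for SISR.

    Stitches four LR crops into a 2x2 LR mosaic and stitches the four
    corresponding HR crops in the same order, so LR/HR alignment and the
    super-resolution scale factor are preserved. Adds no parameters.
    """
    if rng is None:
        rng = np.random.default_rng()
    assert len(lr_patches) == 4 and len(hr_patches) == 4
    # Assumption: all LR crops share one (h, w, c) shape and each HR crop
    # is exactly `scale` times larger in both spatial dimensions.
    assert hr_patches[0].shape[0] == scale * lr_patches[0].shape[0]

    order = rng.permutation(4)  # shuffle which crop lands in which quadrant
    lp = [lr_patches[i] for i in order]
    hp = [hr_patches[i] for i in order]

    # Build the 2x2 mosaics: top row, bottom row, then stack vertically.
    lr_mosaic = np.concatenate(
        [np.concatenate([lp[0], lp[1]], axis=1),
         np.concatenate([lp[2], lp[3]], axis=1)], axis=0)
    hr_mosaic = np.concatenate(
        [np.concatenate([hp[0], hp[1]], axis=1),
         np.concatenate([hp[2], hp[3]], axis=1)], axis=0)
    return lr_mosaic, hr_mosaic
```

In a training loop, one would sample four LR/HR crop pairs from the dataset and feed the returned mosaic pair to the network exactly like an ordinary training pair.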

References

Dong C, Loy CC, He K, Tang X. Learning a deep convolutional network for image super-resolution. In: European conference on computer vision. Springer; 2014. p. 184-99. DOI: https://doi.org/10.1007/978-3-319-10593-2_13

He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 770-8. DOI: https://doi.org/10.1109/CVPR.2016.90

Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 4700-8. DOI: https://doi.org/10.1109/CVPR.2017.243

Kim J, Lee JK, Lee KM. Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 1646-54. DOI: https://doi.org/10.1109/CVPR.2016.182

Lim B, Son S, Kim H, Nah S, Mu Lee K. Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops; 2017. p. 136-44. DOI: https://doi.org/10.1109/CVPRW.2017.151

Zhang Y, Li K, Li K, Wang L, Zhong B, Fu Y. Image super-resolution using very deep residual channel attention networks. In: Proceedings of the European conference on computer vision (ECCV); 2018. p. 286-301. DOI: https://doi.org/10.1007/978-3-030-01234-2_18

Kim J, Lee JK, Lee KM. Deeply-recursive convolutional network for image super-resolution. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 1637-45. DOI: https://doi.org/10.1109/CVPR.2016.181

Tai Y, Yang J, Liu X. Image super-resolution via deep recursive residual network. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 3147-55. DOI: https://doi.org/10.1109/CVPR.2017.298

Ahn N, Kang B, Sohn KA. Fast, accurate, and lightweight super-resolution with cascading residual network. In: Proceedings of the European conference on computer vision (ECCV); 2018. p. 252-68. DOI: https://doi.org/10.1109/CVPRW.2018.00123

Gao G, Li W, Li J, Wu F, Lu H, Yu Y. Feature distillation interaction weighting network for lightweight image super-resolution. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 36; 2022. p. 661-9. DOI: https://doi.org/10.1609/aaai.v36i1.19946

Zhang D, Li C, Xie N, Wang G, Shao J. PFFN: Progressive Feature Fusion Network for Lightweight Image Super-Resolution. In: Proceedings of the 29th ACM International Conference on Multimedia; 2021. p. 3682-90. DOI: https://doi.org/10.1145/3474085.3475650

Krizhevsky A, Hinton G. Learning multiple layers of features from tiny images. Toronto, Ontario: University of Toronto; 2009.

Kervrann C, Boulanger J. Optimal spatial adaptation for patch-based image denoising. IEEE Transactions on Image Processing. 2006;15(10):2866-78. DOI: https://doi.org/10.1109/TIP.2006.877529

Sivic J, Zisserman A. Video Google: A text retrieval approach to object matching in videos. In: Computer Vision, IEEE International Conference on. vol. 3. IEEE Computer Society; 2003. p. 1470-7. DOI: https://doi.org/10.1109/ICCV.2003.1238663

Csurka G, Dance C, Fan L, Willamowski J, Bray C. Visual categorization with bags of keypoints. In: Workshop on statistical learning in computer vision, ECCV. vol. 1. Prague; 2004. p. 1-2.

Lazebnik S, Schmid C, Ponce J. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR’06). vol. 2. IEEE; 2006. p. 2169-78.

Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929. 2020.

Bochkovskiy A, Wang CY, Liao HYM. Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934. 2020.

Yun S, Han D, Oh SJ, Chun S, Choe J, Yoo Y. Cutmix: Regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF international conference on computer vision; 2019. p. 6023-32. DOI: https://doi.org/10.1109/ICCV.2019.00612

Gao G, Wang Z, Li J, Li W, Yu Y, Zeng T. Lightweight Bimodal Network for Single-Image Super-Resolution via Symmetric CNN and Recursive Transformer. arXiv preprint arXiv:2204.13286. 2022. DOI: https://doi.org/10.24963/ijcai.2022/128

Li J, Pei Z, Zeng T. From beginner to master: A survey for deep learning-based single-image super-resolution. arXiv preprint arXiv:2109.14335. 2021.

Hui Z, Gao X, Yang Y, Wang X. Lightweight image super-resolution with information multi-distillation network. In: Proceedings of the 27th ACM International Conference on Multimedia; 2019. p. 2024-32. DOI: https://doi.org/10.1145/3343031.3351084

Lan R, Sun L, Liu Z, Lu H, Pang C, Luo X. MADNet: a fast and lightweight network for single-image super resolution. IEEE transactions on cybernetics. 2020;51(3):1443-53. DOI: https://doi.org/10.1109/TCYB.2020.2970104

Xiao J, Ye Q, Zhao R, Lam KM, Wan K. Self-feature learning: An efficient deep lightweight network for image super-resolution. In: Proceedings of the 29th ACM International Conference on Multimedia; 2021. p. 4408-16. DOI: https://doi.org/10.1145/3474085.3475588

Chen H, Wang Y, Guo T, Xu C, Deng Y, Liu Z, et al. Pre-trained image processing transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2021. p. 12299-310.

Liang J, Cao J, Sun G, Zhang K, Van Gool L, Timofte R. Swinir: Image restoration using swin transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021. p. 1833-44. DOI: https://doi.org/10.1109/ICCVW54120.2021.00210

Lu Z, Li J, Liu H, Huang C, Zhang L, Zeng T. Transformer for single image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022. p. 457-66. DOI: https://doi.org/10.1109/CVPRW56347.2022.00061

Efros AA, Leung TK. Texture synthesis by non-parametric sampling. In: Proceedings of the seventh IEEE international conference on computer vision. vol. 2. IEEE; 1999. p. 1033-8. DOI: https://doi.org/10.1109/ICCV.1999.790383

Shocher A, Cohen N, Irani M. “zero-shot” super-resolution using deep internal learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2018. p. 3118-26. DOI: https://doi.org/10.1109/CVPR.2018.00329

Park T, Efros AA, Zhang R, Zhu JY. Contrastive learning for unpaired image-to-image translation. In: European conference on computer vision. Springer; 2020. p. 319-45. DOI: https://doi.org/10.1007/978-3-030-58545-7_19

Han J, Shoeiby M, Petersson L, Armin MA. Dual contrastive learning for unsupervised image-to-image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2021. p. 746-55. DOI: https://doi.org/10.1109/CVPRW53098.2021.00084

Brendel W, Bethge M. Approximating CNNs with bag-of-local-features models works surprisingly well on ImageNet. arXiv preprint arXiv:1904.00760. 2019.

Gontijo Lopes R, Yin D, Poole B, Gilmer J, Cubuk ED. Improving robustness without sacrificing accuracy with Patch Gaussian augmentation. arXiv e-prints. 2019:arXiv-1906.

Dwibedi D, Misra I, Hebert M. Cut, paste and learn: Surprisingly easy synthesis for instance detection. In: Proceedings of the IEEE international conference on computer vision; 2017. p. 1301-10. DOI: https://doi.org/10.1109/ICCV.2017.146

Georgakis G, Mousavian A, Berg AC, Kosecka J. Synthesizing training data for object detection in indoor scenes. arXiv preprint arXiv:1702.07836. 2017. DOI: https://doi.org/10.15607/RSS.2017.XIII.043

Lin S, Yu T, Feng R, Li X, Jin X, Chen Z. Local patch autoaugment with multi-agent collaboration. arXiv preprint arXiv:2103.11099. 2021.

Cubuk ED, Zoph B, Mane D, Vasudevan V, Le QV. Autoaugment: Learning augmentation policies from data. arXiv preprint arXiv:1805.09501. 2018. DOI: https://doi.org/10.1109/CVPR.2019.00020

Qin Y, Zhang C, Chen T, Lakshminarayanan B, Beutel A, Wang X. Understanding and improving robustness of vision transformers through patch-based negative augmentation. arXiv preprint arXiv:2110.07858. 2021.

Han J, Fang P, Li W, Hong J, Armin MA, Reid I, et al. You Only Cut Once: Boosting Data Augmentation with a Single Cut. arXiv preprint arXiv:2201.12078. 2022.

Zhao H, Gallo O, Frosio I, Kautz J. Loss functions for image restoration with neural networks. IEEE Transactions on computational imaging. 2016;3(1):47-57. DOI: https://doi.org/10.1109/TCI.2016.2644865

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in neural information processing systems. 2017;30.

Bevilacqua M, Roumy A, Guillemot C, Alberi Morel ML. Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding. In: British Machine Vision Conference (BMVC). Guildford, Surrey, United Kingdom; 2012. Available from: https://hal.inria.fr/hal-00747054. DOI: https://doi.org/10.5244/C.26.135

Zeyde R, Elad M, Protter M. On single image scale-up using sparse-representations. In: International conference on curves and surfaces. Springer; 2010. p. 711-30. DOI: https://doi.org/10.1007/978-3-642-27413-8_47

Martin D, Fowlkes C, Tal D, Malik J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001. vol. 2. IEEE; 2001. p. 416-23.

Huang JB, Singh A, Ahuja N. Single image super-resolution from transformed self-exemplars. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2015. p. 5197-206. DOI: https://doi.org/10.1109/CVPR.2015.7299156

Matsui Y, Ito K, Aramaki Y, Fujimoto A, Ogawa T, Yamasaki T, et al. Sketch-based manga retrieval using manga109 dataset. Multimedia Tools and Applications. 2017;76(20):21811-38. DOI: https://doi.org/10.1007/s11042-016-4020-z

Timofte R, Agustsson E, Van Gool L, Yang MH, Zhang L. Ntire 2017 challenge on single image super-resolution: Methods and results. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops; 2017. p. 114-25. DOI: https://doi.org/10.1109/CVPRW.2017.150

Zhang X, Gao P, Liu S, Zhao K, Li G, Yin L, et al. Accurate and efficient image super-resolution via global-local adjusting dense network. IEEE Transactions on Multimedia. 2020;23:1924-37. DOI: https://doi.org/10.1109/TMM.2020.3005025

Published

25-05-2023

How to Cite

Nguyen, Q. T., & Quang Hieu, T. (2023). Enhancing Single-Image Super-Resolution using Patch-Mosaic Data Augmentation on Lightweight Bimodal Network. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 10(2), e1. https://doi.org/10.4108/eetinis.v10i2.2774