Enhancing Single-Image Super-Resolution using Patch-Mosaic Data Augmentation on Lightweight Bimodal Network
DOI: https://doi.org/10.4108/eetinis.v10i2.2774
Keywords: single-image super-resolution, data augmentation, vision transformer, CNN
Abstract
With the advancement of deep learning, single-image super-resolution (SISR) has made significant strides. However, most current SISR methods are difficult to deploy in real-world applications because their complex operations incur substantial computational and memory costs. Furthermore, a suitable training dataset is a key factor in model quality. Hybrid models that combine a CNN with a Vision Transformer can be more efficient on the SISR task, but they require large or very high-quality training datasets, which are not always available. To tackle these issues, this research proposes combining a Lightweight Bimodal Network (LBNet) with a Patch-Mosaic data augmentation method, an enhancement of CutMix and YOCO. Trained with patch-oriented Mosaic data augmentation, an efficient Symmetric CNN performs local feature extraction and coarse image restoration, and a Recursive Transformer captures the long-term dependence of images so that global information can be fully used to refine texture details. Extensive experiments show that LBNet with the proposed data augmentation method, which adds no parameters, outperforms the original LBNet and other state-of-the-art techniques that apply image-level data augmentation.
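The abstract does not spell out the Patch-Mosaic procedure, so the following is only a minimal sketch of one plausible patch-level mosaic for paired SISR training data, not the authors' algorithm. It assumes four aligned low-resolution/high-resolution (LR/HR) pairs of equal size and an integer scale factor. It cuts one random quadrant-sized patch from each pair and tiles the patches into a 2x2 mosaic, applying the same cut (scaled up) to the HR image so the pair stays aligned. The function name `patch_mosaic` and all parameters are hypothetical.

```python
import numpy as np


def patch_mosaic(lr_images, hr_images, scale, rng=None):
    """Tile one random patch from each of four LR/HR pairs into a 2x2 mosaic.

    Hypothetical sketch (assumed procedure, not taken from the paper).
    lr_images: four arrays shaped (h, w, c)
    hr_images: four arrays shaped (h * scale, w * scale, c)
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = lr_images[0].shape
    ph, pw = h // 2, w // 2  # LR patch size: one quadrant of the mosaic

    lr_out = np.empty((2 * ph, 2 * pw, c), dtype=lr_images[0].dtype)
    hr_out = np.empty((2 * ph * scale, 2 * pw * scale, c), dtype=hr_images[0].dtype)

    for idx, (lr, hr) in enumerate(zip(lr_images, hr_images)):
        # Random top-left corner of the patch, in LR coordinates.
        top = int(rng.integers(0, h - ph + 1))
        left = int(rng.integers(0, w - pw + 1))
        row, col = divmod(idx, 2)  # position of this patch in the 2x2 grid

        lr_out[row * ph:(row + 1) * ph, col * pw:(col + 1) * pw] = \
            lr[top:top + ph, left:left + pw]
        # The same cut in HR coordinates keeps the LR/HR pair aligned.
        hr_out[row * ph * scale:(row + 1) * ph * scale,
               col * pw * scale:(col + 1) * pw * scale] = \
            hr[top * scale:(top + ph) * scale, left * scale:(left + pw) * scale]

    return lr_out, hr_out
```

Because the mosaic only rearranges existing pixels, a step like this adds no trainable parameters, which is consistent with the abstract's claim about the augmentation. How patches are sized, selected, and mixed in the actual method may differ.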
License
Copyright (c) 2023 EAI Endorsed Transactions on Industrial Networks and Intelligent Systems
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 3.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.