A New Deepfake Detection Method Based on Compound Scaling Dual-Stream Attention Network
DOI: https://doi.org/10.4108/eetpht.10.5912
Keywords: Deepfake detection, compound scaling, channel attention, self-attention, Swin Transformer
Abstract
INTRODUCTION: Deepfake technology overlays existing images or videos onto target images or videos. Its misuse has made information dissemination on the internet increasingly complex and harms both personal and public interests.
OBJECTIVES: An efficient deepfake detection method is needed to limit the impact and harm of deepfakes as far as possible.
METHODS: This paper proposes a deepfake detection method based on a compound scaling dual-stream attention network, which combines a compound scaling module with a Swin Transformer-based dual-stream attention module to detect deepfake videos. The compound scaling module extracts shallow-level features from the images and feeds them into a deep-level feature extraction layer built on the dual-stream attention module. The resulting features are then passed through a fully connected layer for classification, which produces the detection result.
RESULTS: Experiments on the FF++ (FaceForensics++) dataset show a deepfake detection accuracy of 95.62%, which indicates some advantage for the proposed method.
CONCLUSION: The proposed method is feasible and can be used to detect deepfake videos or images.
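The abstract does not spell out how the compound scaling module is configured. As a minimal sketch, assuming it follows EfficientNet-style compound scaling (Tan and Le, 2019), the snippet below shows how a single coefficient phi scales network depth, width, and input resolution together. The base values and the function names `compound_scale` and `flops_multiplier` are illustrative assumptions, not taken from the paper.

```python
import math

# EfficientNet base coefficients (Tan and Le, 2019). Whether the paper reuses
# these exact values is an assumption made for illustration only.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution multipliers


def compound_scale(phi, base_depth=1, base_width=32, base_resolution=224):
    """Scale depth, width, and resolution jointly with one coefficient phi.

    The base_* defaults are hypothetical starting values for a single stage.
    """
    # Layers cannot be fractional, so round the scaled depth up.
    depth = math.ceil(base_depth * ALPHA ** phi)
    # Round the channel count to the nearest multiple of 8.
    width = int(round(base_width * BETA ** phi / 8)) * 8
    resolution = int(round(base_resolution * GAMMA ** phi))
    return depth, width, resolution


def flops_multiplier(phi):
    """Approximate FLOPs growth factor: (alpha * beta^2 * gamma^2) ** phi."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        print(phi, compound_scale(phi), round(flops_multiplier(phi), 2))
```

Because alpha * beta^2 * gamma^2 is close to 2 for these coefficients, each increment of phi roughly doubles the compute cost, which is the trade-off that compound scaling is designed to make explicit.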
License
Copyright (c) 2024 Shuya Wang, Chenjun Du, Yunfang Chen
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.