Financial Fraud: Identifying Corporate Tax Report Fraud Under the Xgboost Algorithm

Authors

Li X

DOI:

https://doi.org/10.4108/eetsis.v10i3.3033

Keywords:

financial fraud, corporate tax, falsification identification, XGBoost algorithm

Abstract

INTRODUCTION: As the economy has developed, financial fraud has become increasingly frequent.

OBJECTIVES: This paper studies the identification of falsified corporate tax reports.

METHODS: First, financial fraud was briefly introduced. Samples were then selected from the CSMAR database; 18 fraud-related indicators were selected from corporate tax reports, of which 13 were retained after information screening. Finally, the XGBoost algorithm was used to identify tax report falsification.

RESULTS: The XGBoost algorithm achieved the highest accuracy (94.55%) in identifying corporate tax report falsification, whereas the accuracy of the other algorithms, such as logistic regression, was below 90%. Its F1 value was also high, at 90.1%, and it had the shortest running time (55 s).

CONCLUSION: The results demonstrate that the XGBoost algorithm is reliable for identifying corporate tax report falsification and that it can be applied in practice.
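
The accuracy and F1 values reported in RESULTS are standard binary-classification metrics. The following is a minimal sketch of how they are computed from predicted and true labels. The function name `classification_metrics`, the label convention (1 = falsified, 0 = not falsified) and the toy labels are assumptions made for illustration; they are not the paper's code or data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumed convention (not from the paper): 1 = falsified, 0 = not falsified.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # falsified, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # falsified, missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only; these are not samples from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

When falsified reports are a minority of the samples, accuracy alone can look high even if many of them are missed, which is one reason an F1 value is commonly reported alongside it.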

Published

05-05-2023

How to Cite

Li X. Financial Fraud: Identifying Corporate Tax Report Fraud Under the Xgboost Algorithm. EAI Endorsed Scal Inf Syst [Internet]. 2023 May 5 [cited 2024 Dec. 25];10(4):e10. Available from: https://publications.eai.eu/index.php/sis/article/view/3033