Classification Algorithms for Liver Epidemic Identification


  • Koteswara Rao Makkena, Vellore Institute of Technology University
  • Karthika Natarajan, Vellore Institute of Technology University



medical care, liver epidemic, prognosis, classification models, Synthetic Minority Over-sampling Technique


The liver lies in the upper right region of the abdomen, beneath the diaphragm and above the stomach, and is essential for the proper functioning of the body. Its principal tasks are to eliminate waste produced by the organs, digest food, and store vitamins and energy. It also regulates the balance of hormones and filters bacteria, viruses, and other harmful substances from the blood. In severe cases, liver disease can be fatal. Liver diseases are classified by their causes or distinguishing characteristics; common categories include viral hepatitis, autoimmune liver disease, metabolic liver disease, alcohol-related liver disease, non-alcoholic fatty liver disease, genetic liver disease, drug-induced liver injury, and biliary tract disorders. Machine learning algorithms can help identify patterns and risk factors that may be difficult for humans to detect, enabling clinicians to diagnose disease earlier, which leads to better treatment outcomes and improved patient care. In this work, several machine learning methods are implemented and compared on performance metrics to identify whether a person is affected by liver disease. The algorithms used are the Random Forest classifier, K-nearest neighbor, XGBoost, Decision tree, Logistic Regression, support vector machine, and Extra Trees classifier. After applying the Synthetic Minority Over-sampling Technique (SMOTE), the models achieved the following accuracies: Random Forest classifier 67.4%, K-nearest neighbor 54.8%, XGBoost 72%, Decision tree 65.1%, Logistic Regression 68.0%, support vector machine 65.1%, and Extra Trees classifier 70.2%.
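The oversampling step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors implemented it, so the code below is an assumption-laden illustration of the general SMOTE idea (Chawla et al., 2002) for a binary problem, written with the Python standard library only: each synthetic row is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the default `k=5`, and the toy data are hypothetical and not taken from the paper.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority samples.

    Illustrative sketch only; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal size.

    Assumes a binary problem: every other label counts as the majority.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    deficit = (len(y) - len(minority)) - len(minority)
    new_rows = smote(minority, max(deficit, 0), k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Toy data (hypothetical): three minority rows, six majority rows.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
     [5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0], [7.0, 7.0], [8.0, 8.0]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0]
X_bal, y_bal = balance(X, y, minority_label=1, k=2)
```

In a comparison like the one described here, oversampling is normally applied to the training split only, so that accuracy is measured on real, unseen records. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn library would usually be preferred over a hand-written version.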








How to Cite

Makkena KR, Natarajan K. Classification Algorithms for Liver Epidemic Identification. EAI Endorsed Trans Perv Health Tech [Internet]. 2023 Nov. 13 [cited 2023 Dec. 10];9. Available from: