Enhancing Diabetes Prediction with Data Preprocessing and various Machine Learning Algorithms
DOI: https://doi.org/10.4108/eetiot.5348
Keywords: Accuracy, Diabetes, Machine Learning, Naive Bayes, Random Forest
Abstract
Diabetes mellitus, usually called diabetes, is a serious public health issue that is spreading like an epidemic around the world. It is a condition that results in elevated glucose levels in the blood. India is often referred to as the 'Diabetes Capital of the World' because of the country's 17% share of the global diabetes population. An estimated 77 million Indians over the age of 18 have diabetes (i.e., one in eleven), and an estimated 25 million more are pre-diabetic. One way to control the growth of diabetes is to detect it at an early stage, which can lead to improved treatment. In this project, we therefore use several machine learning algorithms, namely SVM, Decision Tree Classifier, Random Forest, KNN, Linear Regression, Logistic Regression, and Naive Bayes, to predict diabetes effectively. The Pima Indians Diabetes Database is used in this project. According to the experimental findings, Random Forest produced an accuracy of 91.10%, the highest among the algorithms used.
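The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. It is not the authors' pipeline: the data here is a synthetic stand-in with the same shape as the Pima Indians Diabetes Database (768 rows, 8 features), and the function name `compare_classifiers`, the 80/20 split, the feature scaling, and all hyperparameters are assumptions made for illustration. Linear Regression is left out because it is a regressor rather than a classifier. The accuracies it prints will not match the 91.10% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(random_state=0):
    """Fit several classifiers on one split and return {name: test accuracy}."""
    # ASSUMPTION: synthetic data standing in for the Pima dataset
    # (768 samples, 8 numeric features, binary outcome).
    X, y = make_classification(
        n_samples=768,
        n_features=8,
        n_informative=5,
        n_redundant=1,
        random_state=random_state,
    )

    # ASSUMPTION: a stratified 80/20 hold-out split; the source does not
    # state how the data was partitioned.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )

    # Distance- and margin-based models are wrapped with a scaler so that
    # features on different scales do not dominate the fit.
    models = {
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "Decision Tree": DecisionTreeClassifier(random_state=random_state),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Naive Bayes": GaussianNB(),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    # Print the models from highest to lowest test accuracy.
    ranked = sorted(compare_classifiers().items(), key=lambda kv: kv[1], reverse=True)
    for name, acc in ranked:
        print(f"{name:20s} {acc:.3f}")
```

To try this on the real data, the `make_classification` call would be replaced by loading the Pima CSV into `X` and `y`. A single hold-out split gives a noisy estimate on a dataset this small, so cross-validation would be a reasonable next step.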
License
Copyright (c) 2024 EAI Endorsed Transactions on Internet of Things
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 3.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.